  1. Solr
  2. SOLR-15764

Extend validateSourcePatterns task to scan for LTR/RTL unicode to catch "Trojan Source" (see paper)


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.0, 8.11
    • Component/s: Build
    • Labels: None

    Description

      A recently published paper describes how a malicious contributor can supply a patch that compiles successfully, but to code that differs from what the reviewing committer believes it to be. The attack works because UIs such as GitHub or your IDE honor Unicode left-to-right/right-to-left switching sequences when rendering, which hides the real code order from the reviewer.

      See paper: https://trojansource.codes/trojan-source.pdf
      Home page: https://trojansource.codes/

      LTR/RTL characters make no sense in source code. Compilers such as GCC are expected to get updates soon, but I am not sure about Java.

      So I suggest adding a pattern for these code points to the validateSourcePatterns task.

      For now I would add only the code points described in the paper, although rmuir suggested excluding a larger range. The regex does not catch the other malicious pattern: using visually similar characters to add hidden duplicate methods. In my view the risk there is lower, unless somebody hides the "bad" method using the LTR/RTL tricks above.
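
      As an illustration only, a check of this kind could look like the sketch below. It is not the pattern that was added to validateSourcePatterns; the class name BidiControlScanner and the method findBidiControls are hypothetical, and the character class is an assumption covering the explicit bidi embedding, override, and isolate controls (U+202A to U+202E and U+2066 to U+2069).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Solr build.
public class BidiControlScanner {

  // Assumed range: explicit bidi embedding/override controls (U+202A..U+202E)
  // and isolate controls (U+2066..U+2069). The escapes are resolved by the regex engine.
  private static final Pattern BIDI_CONTROLS =
      Pattern.compile("[\\u202A-\\u202E\\u2066-\\u2069]");

  /** Returns the char offset of every bidi control character found in the text. */
  public static List<Integer> findBidiControls(String source) {
    List<Integer> offsets = new ArrayList<>();
    Matcher m = BIDI_CONTROLS.matcher(source);
    while (m.find()) {
      offsets.add(m.start());
    }
    return offsets;
  }

  public static void main(String[] args) {
    String clean = "if (isAdmin) { grant(); }";
    // The string literal below contains a right-to-left override inside a comment.
    String suspicious = "if (isAdmin) { /* \u202E } grant(); */";

    System.out.println("clean:      " + findBidiControls(clean));
    System.out.println("suspicious: " + findBidiControls(suspicious));
  }
}
```

      A build task would run such a check over every source file and fail on any non-empty result. Widening the character class, as rmuir suggested, only changes the pattern, not the surrounding logic.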


            People

              Assignee: Uwe Schindler (uschindler)
              Reporter: Uwe Schindler (uschindler)
