Details
- New Feature
- Status: Closed
- Blocker
- Resolution: Fixed
- None
- None
- New
Description
A recently published paper describes how a malicious code contributor can supply a patch that compiles successfully, but not to the code the reviewing committer thinks it does. The trick relies on the fact that UIs like GitHub or your IDE honor Unicode left-to-right/right-to-left switching sequences, which hides code from the reviewer.
See paper: https://trojansource.codes/trojan-source.pdf
Home page: https://trojansource.codes/
For source code it makes no sense to have LTR/RTL characters. Compilers like GCC will get updates soon, but I am not sure about Java.
So I suggest adding the pattern of code points to the validateSourcePatterns task.
For now I would only add the code points described in the paper, but rmuir suggested excluding a larger range. What the regex does not catch is the other malicious pattern: using visually similar characters to add hidden duplicate methods. The risk there is lower in my mind, unless somebody hides the "bad" method using the above LTR/RTL tricks.
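To make the proposal concrete, here is a minimal sketch of such a check in plain Java. It is not the actual validateSourcePatterns implementation; the class name `BidiControlScanner`, the method `findViolations`, and the report format are hypothetical and chosen only for illustration. The regex covers just the nine bidirectional control code points listed in the paper (U+202A to U+202E and U+2066 to U+2069); a broader exclusion range would simply widen the character class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class BidiControlScanner {
  // Bidi controls named in the Trojan Source paper:
  // U+202A..U+202E = LRE, RLE, PDF, LRO, RLO
  // U+2066..U+2069 = LRI, RLI, FSI, PDI
  static final Pattern BIDI_CONTROLS =
      Pattern.compile("[\\u202A-\\u202E\\u2066-\\u2069]");

  // Returns one message per offending character, as file:line:column.
  static List<String> findViolations(String fileName, String source) {
    List<String> violations = new ArrayList<>();
    String[] lines = source.split("\\R", -1);
    for (int i = 0; i < lines.length; i++) {
      Matcher m = BIDI_CONTROLS.matcher(lines[i]);
      while (m.find()) {
        int codePoint = lines[i].charAt(m.start());
        violations.add(String.format("%s:%d:%d: bidi control U+%04X",
            fileName, i + 1, m.start() + 1, codePoint));
      }
    }
    return violations;
  }

  public static void main(String[] args) {
    // The escape below puts a real RLO character into the string at compile time.
    String tainted = "int y = 2; // \u202E hidden\n";
    for (String v : findViolations("Tainted.java", tainted)) {
      System.out.println(v);
    }
  }
}
```

A build task could fail whenever `findViolations` returns a non-empty list for any source file. As noted above, a check like this does not catch visually similar (homoglyph) characters.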
Issue Links
- is cloned by: SOLR-15764 Extend validateSourcePatterns task to scan for LTR/RTL unicode to catch "Trojan Source" (see paper) (Closed)
- links to