Resolution: Won't Fix
Affects Version/s: 5.4.1
Fix Version/s: None
Environment: Alpine Linux v3.3
I'm using ElasticSearch (v2.2.0, Lucene v5.4.1) and its Pattern Replace Char Filter (Lucene's PatternReplaceCharFilter). I need to filter out URLs from my query text before it is tokenised. However, I found that some input strings cause ElasticSearch to "hang" (slowly consuming more CPU and memory) until the system crashes.
I pasted the regex and the attack string into https://regex101.com
- Test string:
https://regex101.com shows the problem to be "Catastrophic backtracking"
Catastrophic backtracking has been detected and the execution of your expression has been halted. To find out more what this is, please read the following article: Runaway Regular Expressions.
It would be great if Lucene could detect "Catastrophic backtracking" and throw an error or return null.
As an aside, I created a unit test for our PHP application that uses the same regexp and test string. (PHP accepts the same regexp, even though in the ElasticSearch case it is written for Java.) Interestingly, in PHP the regex results in `null`, which is the documented return value of preg_replace when an error occurs. If PHP can return an error rather than crashing, surely Lucene / Java can too :trollface: ?
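To illustrate the kind of behaviour I mean, here is a minimal sketch of one possible workaround at the plain java.util.regex level. It does not detect backtracking as such; it only enforces a time budget by wrapping the input in a CharSequence whose charAt() throws once a deadline has passed. The names BoundedRegex, DeadlineCharSequence and replaceAllWithin are hypothetical and are not part of Lucene or ElasticSearch, and I have not tried wiring this into PatternReplaceCharFilter.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene API: bounds regex evaluation by wall-clock time.
final class BoundedRegex {

    // Wraps the input so every character read by the regex engine checks a deadline.
    static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public char charAt(int index) {
            // Overflow-safe comparison of System.nanoTime() values.
            if (System.nanoTime() - deadlineNanos > 0) {
                throw new IllegalStateException("regex evaluation exceeded its time budget");
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            // Keep the same deadline for any sub-sequence handed out.
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    // Same result as Matcher.replaceAll, but throws IllegalStateException
    // if the budget (in milliseconds) runs out while the input is being read.
    static String replaceAllWithin(Pattern pattern, String input,
                                   String replacement, long budgetMillis) {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        return pattern.matcher(new DeadlineCharSequence(input, deadline))
                      .replaceAll(replacement);
    }
}
```

A caller would catch the IllegalStateException and decide whether to reject the query or fall back to the unfiltered text, which is roughly the equivalent of PHP returning `null`. The per-character clock check adds some overhead, so the budget and the approach itself would need measuring before use.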
(I originally opened a ticket to the ElasticSearch project but got told opening it here would be more appropriate - sorry if I'm wrong)