Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 9.0
None
None
New
Description
The class org.apache.lucene.util.automaton.RegExp fails to parse valid regular expressions that contain a double quote outside a character class. Any RegexpQuery built from such an expression fails in the same way.
Example:
Query q = new RegexpQuery( new Term( "field", "a\"b" ) );
RegExp r = new RegExp( "a\"b" );
Both fail with:
java.lang.IllegalArgumentException: expected '"' at position 3
at org.apache.lucene.util.automaton.RegExp.parseSimpleExp(RegExp.java:1299)
at org.apache.lucene.util.automaton.RegExp.parseCharClassExp(RegExp.java:1229)
at org.apache.lucene.util.automaton.RegExp.parseComplExp(RegExp.java:1218)
at org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:1192)
at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1185)
at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1187)
at org.apache.lucene.util.automaton.RegExp.parseInterExp(RegExp.java:1179)
at org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:1173)
at org.apache.lucene.util.automaton.RegExp.<init>(RegExp.java:496)
...
As a workaround, we currently replace every double quote in the pattern with a dot before handing it to RegExp.
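For reference, the workaround can be sketched with plain string handling. The class and method names below (QuoteWorkaround, replaceQuotesWithDot, escapeQuotes) are hypothetical and not part of Lucene. The second method is an assumption on our side: as far as we can tell from the RegExp syntax documentation, an unescaped double quote opens a quoted string literal and a backslash escapes any single character, so escaping the quote may be a more precise alternative than the dot, which also matches characters other than a quote.

```java
public final class QuoteWorkaround {
    private QuoteWorkaround() {}

    // Workaround described in this report: every double quote becomes '.',
    // which matches any single character (so the pattern gets broader).
    public static String replaceQuotesWithDot(String pattern) {
        return pattern.replace('"', '.');
    }

    // Assumed alternative: prefix each double quote with a backslash so the
    // parser reads it as a literal character. This naive version assumes the
    // input contains no quotes that are already escaped.
    public static String escapeQuotes(String pattern) {
        return pattern.replace("\"", "\\\"");
    }
}
```

With either helper, the resulting string would be passed to new RegExp( ... ) or new RegexpQuery( new Term( "field", ... ) ) in place of the raw pattern.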