LUCENE-10430 (Lucene - Core)

Literal double quotes cause exception in class RegExp

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 9.0
    • Fix Version/s: None
    • Component/s: core/other
    • Labels: None
    • Lucene Fields: New

    Description

      The class org.apache.lucene.util.automaton.RegExp fails to parse valid regular expressions that contain double quotes (except inside character classes). This also affects any RegexpQuery built from such an expression.

      Example: 

      // The Java literal "a\"b" is the three-character regular expression a"b.
      Query q = new RegexpQuery(new Term("field", "a\"b"));
      RegExp r = new RegExp("a\"b");

      Both fail with:

      java.lang.IllegalArgumentException: expected '"' at position 3
          at org.apache.lucene.util.automaton.RegExp.parseSimpleExp(RegExp.java:1299)
          at org.apache.lucene.util.automaton.RegExp.parseCharClassExp(RegExp.java:1229)
          at org.apache.lucene.util.automaton.RegExp.parseComplExp(RegExp.java:1218)
          at org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:1192)
          at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1185)
          at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1187)
          at org.apache.lucene.util.automaton.RegExp.parseInterExp(RegExp.java:1179)
          at org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:1173)
          at org.apache.lucene.util.automaton.RegExp.<init>(RegExp.java:496)
          ...

      As a workaround, we currently replace every double quote with a dot.
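
      The "Not A Problem" resolution is consistent with the syntax documented for RegExp, in which a double quote opens a quoted literal string and a backslash makes the following character literal. Under that reading, the pattern a"b starts a quoted string that is never closed, which matches the "expected '"'" message above. A possibly less lossy workaround than the dot replacement is to backslash-escape each bare double quote before the pattern reaches RegExp. The following is only a sketch: the class and method names are hypothetical, it uses no Lucene classes, and it assumes the pattern never intends the quoted-string syntax.

```java
// Hypothetical helper, not part of Lucene: escapes bare double quotes so that
// a parser treating '"' as a string delimiter sees them as literal characters.
final class RegExpQuoteEscaper {

    /**
     * Returns the pattern with every unescaped '"' replaced by '\"'.
     * Existing backslash escapes (including an already escaped quote)
     * are copied unchanged.
     */
    public static String escapeDoubleQuotes(String regex) {
        StringBuilder out = new StringBuilder(regex.length() + 8);
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\' && i + 1 < regex.length()) {
                // Copy an existing escape sequence as-is and skip its second char.
                out.append(c).append(regex.charAt(++i));
            } else if (c == '"') {
                // Bare quote: prefix it with a backslash.
                out.append('\\').append('"');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal "a\"b" is the regex a"b; the result is the regex a\"b.
        System.out.println(escapeDoubleQuotes("a\"b")); // prints a\"b
    }
}
```

      With Lucene on the classpath, the escaped string would then be passed to new RegExp(...) or wrapped in a Term for RegexpQuery in place of the raw pattern. Unlike the dot replacement, the escaped pattern is intended to match only an actual double quote at that position.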

          People

            Assignee: Unassigned
            Reporter: Holger Rehn (ickzon)