Lucene - Core / LUCENE-10501

StackOverflow when RegExp encounters a very large string

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 9.1
    • Fix Version/s: None
    • Component/s: core/queryparser
    • Labels: None
    • Lucene Fields: New

    Description

      When RegExp is given a very long pattern string, parsing it throws a StackOverflowError.

      Simple program to repro:

      $ ls
      RegExpTest.java       lucene-core-9.1.0.jar
      $ cat RegExpTest.java
      class RegExpTest {
          public static void main(String[] args) {
              StringBuilder strBuilder = new StringBuilder();
              for (int i = 0; i < 50000; i++) {
                  strBuilder.append("a");
              }
              try {
                  new org.apache.lucene.util.automaton.RegExp(strBuilder.toString());
              } catch (StackOverflowError e) {
                  System.out.println("Stack overflow");
                  System.exit(-1);
              }
              System.out.println("Success");
          }
      }
      $ javac -cp './lucene-core-9.1.0.jar:.' RegExpTest.java
      $ java -cp './lucene-core-9.1.0.jar:.' RegExpTest
      Stack overflow
      $ java -Xss1G -cp './lucene-core-9.1.0.jar:.' RegExpTest
      Success
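
      The -Xss1G run above raises the stack size for the whole JVM. If changing JVM flags is not an option, one possible workaround is to run only the parse on a dedicated thread with a larger requested stack. The sketch below is a hypothetical helper (BigStackRunner and callWithStack are names invented here, not Lucene APIs). It relies on the standard Thread(ThreadGroup, Runnable, String, long) constructor, whose stackSize argument the JVM is allowed to treat as a hint, so it may not help on every platform.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Lucene.
class BigStackRunner {

    // Runs the task on a fresh thread created with the requested stack size
    // and returns its result. The JVM may round or ignore stackBytes.
    static <T> T callWithStack(Supplier<T> task, long stackBytes) {
        AtomicReference<T> result = new AtomicReference<>();
        AtomicReference<Throwable> failure = new AtomicReference<>();

        Thread worker = new Thread(null, () -> {
            try {
                result.set(task.get());
            } catch (Throwable t) {
                // Includes StackOverflowError if the stack is still too small.
                failure.set(t);
            }
        }, "big-stack-worker", stackBytes);

        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }

        // Re-throw whatever the task threw so the caller sees the original failure.
        Throwable t = failure.get();
        if (t instanceof RuntimeException) throw (RuntimeException) t;
        if (t instanceof Error) throw (Error) t;
        if (t != null) throw new IllegalStateException(t);
        return result.get();
    }

    // Possible usage with the repro above (needs lucene-core on the classpath):
    //   RegExp re = BigStackRunner.callWithStack(
    //       () -> new org.apache.lucene.util.automaton.RegExp(pattern), 1L << 30);
}
```

      This only moves the limit; a sufficiently long pattern could still overflow, and the helper then re-throws the StackOverflowError to the caller.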

      Based on https://issues.apache.org/jira/browse/LUCENE-6156 , this appears to be due to the recursive parsing implementation.
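
      To illustrate why a recursive parser is sensitive to input length, here is a toy sketch. It is not Lucene's RegExp code: the class ConcatParseSketch, its Node type, and its methods are made up for this example, and the "grammar" is just a concatenation of single characters. The recursive version needs one stack frame per character, while the iterative version builds the same right-nested tree with constant stack depth.

```java
// Toy illustration only; this is NOT Lucene's RegExp implementation.
class ConcatParseSketch {

    // Minimal AST: a leaf holds one character, a concat node holds two children.
    static final class Node {
        final char ch;          // meaningful only for leaves (left == null)
        final Node left;
        final Node right;

        Node(char ch) { this.ch = ch; this.left = null; this.right = null; }
        Node(Node left, Node right) { this.ch = 0; this.left = left; this.right = right; }
    }

    // Recursive style: parse one character, then recurse on the rest.
    // Stack depth grows linearly with the (non-empty) input length.
    static Node parseRecursive(String s, int pos) {
        Node head = new Node(s.charAt(pos));
        if (pos + 1 == s.length()) {
            return head;
        }
        return new Node(head, parseRecursive(s, pos + 1));
    }

    // Iterative style: build the same right-nested tree from the end of the
    // (non-empty) input, using a loop instead of recursion.
    static Node parseIterative(String s) {
        Node result = new Node(s.charAt(s.length() - 1));
        for (int i = s.length() - 2; i >= 0; i--) {
            result = new Node(new Node(s.charAt(i)), result);
        }
        return result;
    }

    // Counts leaves by walking the right spine in a loop (no recursion).
    static int length(Node n) {
        int count = 0;
        while (n.left != null) {
            count++;        // the left child of each concat node is a leaf
            n = n.right;
        }
        return count + 1;   // the final leaf
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            sb.append('a');
        }
        String input = sb.toString();

        System.out.println("iterative leaves: " + length(parseIterative(input)));
        try {
            System.out.println("recursive leaves: " + length(parseRecursive(input, 0)));
        } catch (StackOverflowError e) {
            // Whether this triggers depends on the JVM's stack size and frame layout.
            System.out.println("recursive parse overflowed the stack");
        }
    }
}
```

      Note that an iterative parser alone is not enough if the resulting tree is very deep: any later recursive traversal of that tree would need the same treatment.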

          People

            Assignee: Unassigned
            Reporter: Kartik Ganesh (kartg)
            Votes: 1
            Watchers: 2