Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-372

Unmatched right parentheses truncates query

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4, 1.9, 2.0.0, 2.1
    • Fix Version/s: 2.2
    • Component/s: core/queryparser
    • Labels:
      None
    • Environment:

      Operating System: Linux
      Platform: PC

    • Bugzilla Id:
      34447
    • Lucene Fields:
      Patch Available

      Description

      The query processor truncates a query when right parentheses are unmatched.
      E.g.:

      secret AND illegal) AND access:confidential

      will not result in a ParseException instead will run as:

      secret AND illegal

      1. Lucene-372.patch
        5 kB
        Andreas Neumann
      2. LUCENE-372.patch.txt
        4 kB
        Steven Parkes

        Activity

        Hide
        neunand Andreas Neumann added a comment -

        The reason for this is that the parse() method does not ensure that the entire input string has been consumed. Query() will simply return if it encounters an unexpected token, relying on its caller to consume that token. If the query is nested, the call will throw an exception. But at the top-level, the caller is parse(), and it does not make sure that <EOF> is reached.

        A simple fix is to introdice a new nonterminal TopLevelQuery ::= Query <EOF>, and parse() calls TopLevelQuery. I have a patch for this, will post it together with a few new unit test cases for this.

        Andreas.

        Show
        neunand Andreas Neumann added a comment - The reason for this is that the parse() method does not ensure that the entire input string has been consumed. Query() will simply return if it encounters an unexpected token, relying on its caller to consume that token. If the query is nested, the call will throw an exception. But at the top-level, the caller is parse(), and it does not make sure that <EOF> is reached. A simple fix is to introdice a new nonterminal TopLevelQuery ::= Query <EOF>, and parse() calls TopLevelQuery. I have a patch for this, will post it together with a few new unit test cases for this. Andreas.
        Hide
        neunand Andreas Neumann added a comment -

        This patch contains 5 files (I was not sure whether to include the JavaCC-generated files, ended up including them):

        • QueryParser.jJ: introduced a new nonterminal TopLevelQuery
        • QueryParser.java: generated from QueryParser.jj by javaCC
        • QueryParserTokenManager.java: generated from QueryParser.jj by javaCC
        • TestQueryParser.java: rewrote testException and added a few more tests.
        • TestMultiFieldQueryParser.java: fixed testSimple() because it was passing in unbalanced quotation marks. That caused a failure with the new QueryParser.
        • Andreas.
        Show
        neunand Andreas Neumann added a comment - This patch contains 5 files (I was not sure whether to include the JavaCC-generated files, ended up including them): QueryParser.jJ: introduced a new nonterminal TopLevelQuery QueryParser.java: generated from QueryParser.jj by javaCC QueryParserTokenManager.java: generated from QueryParser.jj by javaCC TestQueryParser.java: rewrote testException and added a few more tests. TestMultiFieldQueryParser.java: fixed testSimple() because it was passing in unbalanced quotation marks. That caused a failure with the new QueryParser. Andreas.
        Hide
        steven_parkes Steven Parkes added a comment -

        Somehow the patch got some extra line feeds in some of the javacc created files. This version takes those out.

        Show
        steven_parkes Steven Parkes added a comment - Somehow the patch got some extra line feeds in some of the javacc created files. This version takes those out.
        Hide
        steven_parkes Steven Parkes added a comment -

        I like it. There are some extra tests that demonstrate where parse exceptions are going to be thrown, which is nice.

        It would be nice if we could, perhaps, be more liberal in what we accept. For example, I've always thought it would be nice if we could add in missing terminators, e.g., quotes, parentheses., in the spirit of "being liberal in what you accept." But I don't see that being too feasible with an automatically generated parser.

        I do think completely ignoring terms, which this patch fixes, is the wrong way to be liberal.

        Show
        steven_parkes Steven Parkes added a comment - I like it. There are some extra tests that demonstrate where parse exceptions are going to be thrown, which is nice. It would be nice if we could, perhaps, be more liberal in what we accept. For example, I've always thought it would be nice if we could add in missing terminators, e.g., quotes, parentheses., in the spirit of "being liberal in what you accept." But I don't see that being too feasible with an automatically generated parser. I do think completely ignoring terms, which this patch fixes, is the wrong way to be liberal.
        Hide
        michaelbusch Michael Busch added a comment -

        I just committed this. Thanks, Andreas!

        Show
        michaelbusch Michael Busch added a comment - I just committed this. Thanks, Andreas!

          People

          • Assignee:
            michaelbusch Michael Busch
            Reporter:
            patrick.hochstenbach@ugent.be Patrick Hochstenbach
          • Votes:
            2 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development