Lucene - Core
  1. Lucene - Core
  2. LUCENE-4109

BooleanQueries are not parsed correctly with the flexible query parser

    Details

    • Lucene Fields:
      New

      Description

      Hi,

      I just found another bug in the flexible query parser (together with Robert Muir, yay!).

      The following query string works in the standard query parser:

      (field:[1 TO *] AND field:[* TO 2]) AND field2:z
      

      yields

      +(+field:[1 TO *] +field:[* TO 2]) +field2:z
      

      The flexible query parser though yields:

      +(field:[1 TO *] field:[* TO 2]) +field2:z
      

      Test patch is attached (from Robert actually).

      I don't know if it affects earlier versions than 3.5.

      1. LUCENE-4109_branches_lucene_solr_3_6_rev1367055.patch
        7 kB
        Karsten R.
      2. LUCENE-4109_trunk_rev1366771.patch
        86 kB
        Karsten R.
      3. LUCENE-4109.patch
        86 kB
        Robert Muir
      4. LUCENE-4109.patch
        16 kB
        Robert Muir
      5. LUCENE-4109.patch
        14 kB
        Karsten R.
      6. test-patch.txt
        3 kB
        Daniel Truemper

        Activity

        Hide
        Hoss Man added a comment -

        bulk cleanup of 4.0-ALPHA / 4.0 Jira versioning. all bulk edited issues have hoss20120711-bulk-40-change in a comment

        Show
        Hoss Man added a comment - bulk cleanup of 4.0-ALPHA / 4.0 Jira versioning. all bulk edited issues have hoss20120711-bulk-40-change in a comment
        Hide
        Karsten R. added a comment -

        other differences between classic QueryParser and flexible StandardQueryParser can be seen, if we run QueryParserTestBase against StandardQueryParser:

        • assertQueryEquals("\\* TO \"*\"",null,"\\* TO \\*");
          → result is {{ "\\* TO *" }} instead
        • {{ { assertQueryEquals("a:b\\-?c", a, "a:b\\-?c"); }

          }}
          → result is {{

          { "a:b-?c" }

          }} instead

        • assertQueryEquals("a - b", a, "a – b"); }}
          LUCENE-2566

        Is it possible to backport a solution to 3.x? I have a customer with this exact problem on 3.6...

        Show
        Karsten R. added a comment - other differences between classic QueryParser and flexible StandardQueryParser can be seen, if we run QueryParserTestBase against StandardQueryParser: assertQueryEquals(" \\* TO \"*\" ",null," \\* TO \\* "); → result is {{ " \\* TO * " }} instead {{ { assertQueryEquals("a:b\\-?c", a, "a:b\\-?c"); } }} → result is {{ { "a:b-?c" } }} instead assertQueryEquals("a - b", a, "a – b"); }} → LUCENE-2566 Is it possible to backport a solution to 3.x? I have a customer with this exact problem on 3.6...
        Hide
        Robert Muir added a comment -

        Karsten, porting LUCENE-2566 to the QP is not really related to this issue. Maybe leave a comment over there?

        This issue is a straight up bug. If someone can figure out whats broken, I'll fix it. But I dont know where
        the bug is!

        Show
        Robert Muir added a comment - Karsten, porting LUCENE-2566 to the QP is not really related to this issue. Maybe leave a comment over there? This issue is a straight up bug. If someone can figure out whats broken, I'll fix it. But I dont know where the bug is!
        Hide
        Karsten R. added a comment -

        Robert, the bug is in GroupQueryNodeProcessor:

        StandardQueryTreeBuilder does not care about AndQueryNode or OrQueryNode, it expects to find the correct ModifierQueryNode-children.
        GroupQueryNodeProcessor is responsible to insert this children (BooleanModifiersQueryNodeProcessor is not used in StandardQueryNodeProcessorPipeline).
        But GroupQueryNodeProcessor does not support a proper recursion: You can hide a GroupQueryNode as child of AndQueryNodes or NOT-Nodes or Boost-Nodes.
        So all in all this tests failed because "(a AND b)" is treated like "(a b)":

        • assertQueryEquals("!(a AND b) OR c", null, "-(+a +b) c");
        • assertQueryEquals("!(a AND b) AND c", null, "-(+a +b) +c");
        • assertQueryEquals("((a AND b) AND c)", null, "(+a +b) +c");
        • assertQueryEquals("(a AND b) AND c", null, "(+a +b) +c");
        • {{ {assertQueryEquals("!(a AND b)", null, "!(+a +b)");}

          }}

        • assertQueryEquals("(a AND b)^4 OR c", null, "(+a +b)^4 c");

        imho GroupQueryNodeProcessor should be divided in two processors. On processor to remove all GroupQueryNodes and one like BooleanModifiersQueryNodeProcessor to add ModifiersQueryNodes.
        btw: If GroupQueryNodeProcessor would extends QueryNodeProcessorImpl we could be sure about the recursion.

        Show
        Karsten R. added a comment - Robert, the bug is in GroupQueryNodeProcessor: StandardQueryTreeBuilder does not care about AndQueryNode or OrQueryNode, it expects to find the correct ModifierQueryNode-children. GroupQueryNodeProcessor is responsible to insert this children (BooleanModifiersQueryNodeProcessor is not used in StandardQueryNodeProcessorPipeline). But GroupQueryNodeProcessor does not support a proper recursion: You can hide a GroupQueryNode as child of AndQueryNodes or NOT-Nodes or Boost-Nodes. So all in all this tests failed because "(a AND b)" is treated like "(a b)": assertQueryEquals("!(a AND b) OR c", null, "-(+a +b) c"); assertQueryEquals("!(a AND b) AND c", null, "-(+a +b) +c"); assertQueryEquals("((a AND b) AND c)", null, "(+a +b) +c"); assertQueryEquals("(a AND b) AND c", null, "(+a +b) +c"); {{ {assertQueryEquals("!(a AND b)", null, "!(+a +b)");} }} assertQueryEquals("(a AND b)^4 OR c", null, "(+a +b)^4 c"); imho GroupQueryNodeProcessor should be divided in two processors. On processor to remove all GroupQueryNodes and one like BooleanModifiersQueryNodeProcessor to add ModifiersQueryNodes. btw: If GroupQueryNodeProcessor would extends QueryNodeProcessorImpl we could be sure about the recursion.
        Hide
        Robert Muir added a comment -

        Karsten, do you think you can make a patch to fix this?

        Show
        Robert Muir added a comment - Karsten, do you think you can make a patch to fix this?
        Hide
        Karsten R. added a comment -

        Robert, the current GroupQueryNodeProcessor minimize the amount of nodes in the SyntaxTree (via defaultOperator).
        I think this screw me up. I could make a patch without this optimization which passes TestQPHelper.
        But possible we need more TestCases. I would like to test against QueryParserTestBase, but this can not be extended for StandardQueryParser because of the defaultField, which moves from constructor to the parse-method and because there is no common interface for classic QP and StandardQP with e.g. the setAllowLeadingWildcard-Method.
        Would it be OK, if I also patch QueryParserTestBase?

        Show
        Karsten R. added a comment - Robert, the current GroupQueryNodeProcessor minimize the amount of nodes in the SyntaxTree (via defaultOperator). I think this screw me up. I could make a patch without this optimization which passes TestQPHelper. But possible we need more TestCases. I would like to test against QueryParserTestBase, but this can not be extended for StandardQueryParser because of the defaultField, which moves from constructor to the parse-method and because there is no common interface for classic QP and StandardQP with e.g. the setAllowLeadingWildcard-Method. Would it be OK, if I also patch QueryParserTestBase?
        Hide
        Robert Muir added a comment -

        Would it be OK, if I also patch QueryParserTestBase?

        Yes!

        Show
        Robert Muir added a comment - Would it be OK, if I also patch QueryParserTestBase? Yes!
        Hide
        Karsten R. added a comment - - edited

        Patch for lucene/contrib against http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6
        The patch adds the Processor BooleanQuery2ModifierNodeProcessor.
        The patch also changes ParametricRangeQueryNodeProcessor as hotfix for LUCENE-3338 (this change is not for 4.X because LUCENE-3338 is already fixed in 4.X).
        The patch passes most tests from QueryParserTestBase e.g. except
        {{

        {assertQueryEquals("[\\* TO \"*\"]",null,"[\\* TO \\*]");}

        }}
        and LUCENE-2566 related tests.
        Patch for trunk will coming soon.

        Show
        Karsten R. added a comment - - edited Patch for lucene/contrib against http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6 The patch adds the Processor BooleanQuery2ModifierNodeProcessor. The patch also changes ParametricRangeQueryNodeProcessor as hotfix for LUCENE-3338 (this change is not for 4.X because LUCENE-3338 is already fixed in 4.X). The patch passes most tests from QueryParserTestBase e.g. except {{ {assertQueryEquals("[\\* TO \"*\"]",null,"[\\* TO \\*]");} }} and LUCENE-2566 related tests. Patch for trunk will coming soon.
        Hide
        Robert Muir added a comment -

        Patch looks good to me!

        I also added Daniels test from buzzwords.

        Thanks for fixing this, and adding additional tests! Once 3.6 branch is open I'll get it in.

        Show
        Robert Muir added a comment - Patch looks good to me! I also added Daniels test from buzzwords. Thanks for fixing this, and adding additional tests! Once 3.6 branch is open I'll get it in.
        Hide
        Robert Muir added a comment -

        Hmm, with the patch some tests for TestMultiFieldQPHelper fail.
        I didn't look into it further, but we should figure out whats going on there if we can.

        Show
        Robert Muir added a comment - Hmm, with the patch some tests for TestMultiFieldQPHelper fail. I didn't look into it further, but we should figure out whats going on there if we can.
        Hide
        Karsten R. added a comment -

        Robert, I forgot to run all tests
        The patch must also include MultiFieldQueryNodeProcessor (new OrQueryNode(children) instead of new BooleanQueryNode(children)) and PrecedenceQueryNodeProcessorPipeline (BooleanQuery2ModifierNodeProcessor.class instead of GroupQueryNodeProcessor.class). I will fix this on monday.
        btw. I hope ((b:one +b:more) t:two) is equal to ((b:one +b:more) (+t:two))

        Show
        Karsten R. added a comment - Robert, I forgot to run all tests The patch must also include MultiFieldQueryNodeProcessor ( new OrQueryNode(children) instead of new BooleanQueryNode(children) ) and PrecedenceQueryNodeProcessorPipeline ( BooleanQuery2ModifierNodeProcessor.class instead of GroupQueryNodeProcessor.class ). I will fix this on monday. btw. I hope ((b:one +b:more) t:two) is equal to ((b:one +b:more) (+t:two))
        Hide
        Karsten R. added a comment -

        Patch against http://svn.apache.org/repos/asf/lucene/dev/trunk (rev1366771)
        The patch adds the processor BooleanQuery2ModifierNodeProcessor and a test-class TestStandardQP which extends QueryParserTestBase.
        There is also a new interface CommonQueryParserConfiguration to configure flexible and classic parser in QueryParserTestBase.
        In TestStandardQP some tests from QueryParserTestBase are hidden by a "do-nothing" overriding method.

        The patch passes all tests from /lucene/queryparser/src. It passes because some tests are changed (without changing any meaning). In particular TestMultiFieldQPHelper is changed.

        Still "\\* TO \"*\"" is parsed as "\\* TO *" and not as "\\* TO \\*".

        If this patch is OK I will provide a Version 3.6.X patch.

        Show
        Karsten R. added a comment - Patch against http://svn.apache.org/repos/asf/lucene/dev/trunk (rev1366771) The patch adds the processor BooleanQuery2ModifierNodeProcessor and a test-class TestStandardQP which extends QueryParserTestBase. There is also a new interface CommonQueryParserConfiguration to configure flexible and classic parser in QueryParserTestBase. In TestStandardQP some tests from QueryParserTestBase are hidden by a "do-nothing" overriding method. The patch passes all tests from /lucene/queryparser/src. It passes because some tests are changed (without changing any meaning). In particular TestMultiFieldQPHelper is changed. Still " \\* TO \"*\" " is parsed as " \\* TO * " and not as " \\* TO \\* ". If this patch is OK I will provide a Version 3.6.X patch.
        Hide
        Robert Muir added a comment -

        queryparser tests doing instanceof on MockAnalyzer and doing different things depending upon whether
        position increments are enabled in its stopfilter is wrong.

        Show
        Robert Muir added a comment - queryparser tests doing instanceof on MockAnalyzer and doing different things depending upon whether position increments are enabled in its stopfilter is wrong.
        Hide
        Robert Muir added a comment -

        updated patch. i removed the instanceof and getter from mockanalyzer.

        the bug is simply that the flexible queryparser didnt enable position increments by default.

        Show
        Robert Muir added a comment - updated patch. i removed the instanceof and getter from mockanalyzer. the bug is simply that the flexible queryparser didnt enable position increments by default.
        Hide
        Robert Muir added a comment -

        This is a great test refactoring. I think its ready for trunk and 4.x

        I'll list in CHANGES that we defaulted positionIncrements=on like ClassicQueryParser.

        For 3.6.x i am not sure we should do this, so maybe we should use your hack

        I'll think about it.

        Show
        Robert Muir added a comment - This is a great test refactoring. I think its ready for trunk and 4.x I'll list in CHANGES that we defaulted positionIncrements=on like ClassicQueryParser. For 3.6.x i am not sure we should do this, so maybe we should use your hack I'll think about it.
        Hide
        Robert Muir added a comment -

        Thank you!

        For 3.6.2 I omitted the test refactoring, and just the fixes, only backporting a few individual tests. I think this would be too much for the bugfix branch.

        Show
        Robert Muir added a comment - Thank you! For 3.6.2 I omitted the test refactoring, and just the fixes, only backporting a few individual tests. I think this would be too much for the bugfix branch.
        Hide
        Karsten R. added a comment -

        Robert, thank you for backporting the patch to 3.6.2
        Please also commit this patch with a change of ParametricRangeQueryNodeProcessor because [1 TO *] is treated like [1 TO //*] (was not in trunk-patch because of LUCENE-3338)

        Show
        Karsten R. added a comment - Robert, thank you for backporting the patch to 3.6.2 Please also commit this patch with a change of ParametricRangeQueryNodeProcessor because [1 TO *] is treated like [1 TO //*] (was not in trunk-patch because of LUCENE-3338 )
        Hide
        Robert Muir added a comment -

        Karsten: can you please open a new issue for adding open range query feature?

        I don't want to mix that up with this bug.

        Show
        Robert Muir added a comment - Karsten: can you please open a new issue for adding open range query feature? I don't want to mix that up with this bug.
        Hide
        Karsten R. added a comment -

        Robert: the patch solves the "Flexible query parser does not support open ranges"-part of LUCENE-3338, it is not a new issue.
        Possible best "fix" is the backporting of LUCENE-3338 (because even with the patch the query [include TO exclude} throws a syntax error)

        Show
        Karsten R. added a comment - Robert: the patch solves the "Flexible query parser does not support open ranges"-part of LUCENE-3338 , it is not a new issue. Possible best "fix" is the backporting of LUCENE-3338 (because even with the patch the query [include TO exclude } throws a syntax error)
        Hide
        Robert Muir added a comment -

        But its not related to this issue at all.

        Show
        Robert Muir added a comment - But its not related to this issue at all.
        Hide
        Uwe Schindler added a comment -

        Closed after release.

        Show
        Uwe Schindler added a comment - Closed after release.

          People

          • Assignee:
            Robert Muir
            Reporter:
            Daniel Truemper
          • Votes:
            2 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development