Lucene - Core / LUCENE-2756

MultiSearcher.rewrite() incorrectly rewrites queries

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This was reported on the user list, in the context of range queries.

      It's also easy to make our existing tests fail with my patch on LUCENE-2751:

      ant test-core -Dtestcase=TestBoolean2 -Dtestmethod=testRandomQueries -Dtests.seed=7679849347282878725:-903778383189134045
      

      The fundamental problem is that MultiSearcher first rewrites the query against each individual sub-searcher,
      then uses Query.combine(), which simply ORs the rewritten sub-queries together.

      This is incorrect for expanded MUST_NOT queries (e.g. from a wildcard), as it violates De Morgan's law.
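
The De Morgan violation can be written out for the smallest case. This is only an illustrative sketch: it assumes a required clause $A$ and a prohibited wildcard that expands to a single term $t_1$ in the first sub-index and a single term $t_2$ in the second. These symbols are introduced here for illustration and do not appear in the original report.

```latex
% What OR-ing the per-sub rewrites produces:
(A \land \lnot t_1) \lor (A \land \lnot t_2)
  \;=\; A \land (\lnot t_1 \lor \lnot t_2)
  \;=\; A \land \lnot (t_1 \land t_2)

% What the original query actually means:
A \land \lnot (t_1 \lor t_2)
  \;=\; A \land \lnot t_1 \land \lnot t_2
```

Under these assumptions, the combined form excludes only documents that contain both expanded terms, whereas the intended query excludes documents that contain either one.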

        Issue Links

          Activity

          Transition          Time In Source Status   Execution Times   Last Executor        Last Execution Date
          Open -> Resolved    59d 11h 21m             1                 Michael McCandless   11/Jan/11 00:22
          Resolved -> Closed  78d 15h 27m             1                 Grant Ingersoll      30/Mar/11 16:50
          Uwe Schindler made changes -
          Link This issue is duplicated by LUCENE-3096 [ LUCENE-3096 ]
          Grant Ingersoll made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Grant Ingersoll added a comment -

          Bulk close for 3.1

          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12564375 ] jira [ 12584887 ]
          Mark Thomas made changes -
          Workflow jira [ 12526680 ] Default workflow, editable Closed status [ 12564375 ]
          Michael McCandless made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 3.1 [ 12314822 ]
          Fix Version/s 4.0 [ 12314025 ]
          Resolution Fixed [ 1 ]
          Michael McCandless added a comment -

          MultiSearcher is now deprecated/removed.

          Uwe Schindler made changes -
          Link This issue is superseded by LUCENE-2837 [ LUCENE-2837 ]
          Robert Muir made changes -
          Field Original Value New Value
          Attachment LUCENE-2756_testcase.patch [ 12459447 ]
          Robert Muir added a comment -

          Attached is a simple test. It adds a single document "foo bar" to one index
          and another document "foo baz" to a second index.

          If you run the query "+foo -ba*", the MultiSearcher rewrites it to:
          (+field:foo -field:baz) (+field:foo -field:bar)

          This causes both documents to match the query, when really neither should.
          Instead, the query should be (+field:foo -field:baz -field:bar).

          If you run the test with -Dtests.verbose=true, you can see the rewritten form.

          The reason this only appeared at a certain document count in the report on the
          user list is that they were using CONSTANT_SCORE_AUTO, and at that
          document count it chose a constant-score boolean rewrite method.
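
The two-document scenario above can be reproduced with a toy model. This is a minimal sketch, not Lucene code: documents are modelled as plain sets of terms, and the helper names `expand_prefix` and `matches` are hypothetical stand-ins for wildcard expansion and boolean clause evaluation.

```python
# Toy model (not the Lucene API): each index is a list of documents,
# and each document is a set of terms.
index_a = [{"foo", "bar"}]
index_b = [{"foo", "baz"}]
all_docs = index_a + index_b


def expand_prefix(prefix, index):
    """Hypothetical stand-in for wildcard rewrite: terms in ONE index with this prefix."""
    return sorted({t for doc in index for t in doc if t.startswith(prefix)})


def matches(doc, must, must_not):
    """A doc matches if it has every MUST term and none of the MUST_NOT terms."""
    return all(t in doc for t in must) and not any(t in doc for t in must_not)


# Rewrite "+foo -ba*" separately against each index, as described above.
per_index = [(["foo"], expand_prefix("ba", idx)) for idx in (index_a, index_b)]

# Combining the per-index rewrites with OR: a doc matches if ANY rewritten query matches.
or_combined_hits = [d for d in all_docs if any(matches(d, m, n) for m, n in per_index)]

# Merging all expanded MUST_NOT terms into a single clause set instead.
merged_not = sorted({t for _, n in per_index for t in n})
merged_hits = [d for d in all_docs if matches(d, ["foo"], merged_not)]

print(len(or_combined_hits), len(merged_hits))
```

In this model the OR-combined form matches both documents, because each document escapes the prohibition that was computed from the other index, while the merged form matches neither.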

          Robert Muir created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Robert Muir
            • Votes:
              0
              Watchers:
              1
