Lucene - Core / LUCENE-2756

MultiSearcher.rewrite() incorrectly rewrites queries

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This was reported on the user list, in the context of range queries.

      It's also easy to make our existing tests fail with my patch on LUCENE-2751:

      ant test-core -Dtestcase=TestBoolean2 -Dtestmethod=testRandomQueries -Dtests.seed=7679849347282878725:-903778383189134045
      

      The fundamental problem is that MultiSearcher first rewrites the query against each
      individual sub-searcher, then uses Query.combine(), which simply ORs the rewritten sub-queries.

      This is incorrect for expanded MUST_NOT queries (e.g. from a wildcard), as it violates De Morgan's law.
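The violation can be written out for two sub-indexes. This is only a sketch, and the symbols are assumptions introduced for illustration: f means "the document contains the required term", and t_1 and t_2 mean "the document contains a term that the prohibited wildcard expanded to in sub-index 1 (respectively 2)".

```latex
\begin{align*}
\text{intended:}\quad
  & f \land \lnot (t_1 \lor t_2) \;=\; f \land \lnot t_1 \land \lnot t_2 \\
\text{per-sub rewrite, then OR:}\quad
  & (f \land \lnot t_1) \lor (f \land \lnot t_2)
    \;=\; f \land (\lnot t_1 \lor \lnot t_2)
    \;=\; f \land \lnot (t_1 \land t_2)
\end{align*}
```

The second form only rejects a document that contains expanded terms from both sub-indexes at once, so a document containing just one of them still matches.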

          Activity

          Grant Ingersoll added a comment -

          Bulk close for 3.1

          Michael McCandless added a comment -

          MultiSearcher is now deprecated/removed.

          Robert Muir added a comment -

          Attached is a simple test. It adds a single document, "foo bar", to one index
          and another document, "foo baz", to a second index.

          If you run the query "+foo -ba*", MultiSearcher rewrites it to:
          (+field:foo -field:baz) (+field:foo -field:bar)

          This causes both documents to match the query, when really neither should.
          Instead, the query should be (+field:foo -field:baz -field:bar).

          If you run the test with -Dtests.verbose=true, you can see the rewritten form.

          The reason this only appeared at a certain document count in the report on the
          user list is that they were using CONSTANT_SCORE_AUTO, and at that
          document count it chose a constant-score boolean rewrite method.
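The two-index example above can be reproduced with a toy model. This is a minimal sketch, not the attached test and not Lucene code: documents are plain sets of terms, and the class and method names (MustNotRewriteSketch, expand, countPerSubRewriteThenOr, countMergedMustNot) are hypothetical names chosen for illustration. It compares ORing the per-sub rewrites against merging all expanded terms into one MUST_NOT set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the "+foo -ba*" example; no Lucene classes are used. */
class MustNotRewriteSketch {

    /** Terms in one sub-index that start with the prefix (stands in for wildcard expansion). */
    static Set<String> expand(List<Set<String>> subIndex, String prefix) {
        Set<String> terms = new TreeSet<>();
        for (Set<String> doc : subIndex) {
            for (String term : doc) {
                if (term.startsWith(prefix)) terms.add(term);
            }
        }
        return terms;
    }

    /** True if the doc contains the required term and none of the prohibited terms. */
    static boolean matches(Set<String> doc, String must, Set<String> mustNot) {
        if (!doc.contains(must)) return false;
        for (String term : mustNot) {
            if (doc.contains(term)) return false;
        }
        return true;
    }

    /** Behaviour described in this issue: expand per sub-index, then OR the rewritten queries. */
    static int countPerSubRewriteThenOr(List<List<Set<String>>> subs, String must, String prefix) {
        List<Set<String>> expansions = new ArrayList<>();
        for (List<Set<String>> sub : subs) expansions.add(expand(sub, prefix));
        int hits = 0;
        for (List<Set<String>> sub : subs) {
            for (Set<String> doc : sub) {
                boolean any = false;
                for (Set<String> mustNot : expansions) {
                    any |= matches(doc, must, mustNot); // one OR clause per sub-index rewrite
                }
                if (any) hits++;
            }
        }
        return hits;
    }

    /** Expected behaviour: all expanded terms end up in a single MUST_NOT set. */
    static int countMergedMustNot(List<List<Set<String>>> subs, String must, String prefix) {
        Set<String> merged = new TreeSet<>();
        for (List<Set<String>> sub : subs) merged.addAll(expand(sub, prefix));
        int hits = 0;
        for (List<Set<String>> sub : subs) {
            for (Set<String> doc : sub) {
                if (matches(doc, must, merged)) hits++;
            }
        }
        return hits;
    }

    /** Two sub-indexes, one document each: "foo bar" and "foo baz". */
    static List<List<Set<String>>> exampleIndexes() {
        Set<String> doc1 = new HashSet<>(Arrays.asList("foo", "bar"));
        Set<String> doc2 = new HashSet<>(Arrays.asList("foo", "baz"));
        return Arrays.asList(Arrays.asList(doc1), Arrays.asList(doc2));
    }

    public static void main(String[] args) {
        List<List<Set<String>>> subs = exampleIndexes();
        System.out.println("per-sub rewrite, then OR: " + countPerSubRewriteThenOr(subs, "foo", "ba"));
        System.out.println("merged MUST_NOT:          " + countMergedMustNot(subs, "foo", "ba"));
    }
}
```

In this model the ORed form counts both documents as hits, while the merged form counts none, which mirrors the mismatch described in the comment.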


            People

            • Assignee:
              Unassigned
              Reporter:
              Robert Muir
            • Votes:
              0
              Watchers:
              1
