Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None

      Description

      I'm on a mission to demonstrate (and then hopefully fix) any inconsistencies between the score you get for a doc when executing a search, and the score you get when asking for an explanation of the query for that doc.

      1. LUCENE-557-BooleanQuery-explain-fix.patch
        4 kB
        Hoss Man
      2. LUCENE-557-FilteredQuery-explain-fix.patch
        1 kB
        Hoss Man
      3. LUCENE-557-modify-existing-tests.patch
        45 kB
        Hoss Man
      4. LUCENE-557-modify-existing-tests.patch
        44 kB
        Hoss Man
      5. LUCENE-557-newtests.zip
        10 kB
        Hoss Man
      6. LUCENE-557-newtests.zip
        9 kB
        Hoss Man
      7. LUCENE-557-SpanScorer-explain-HACK-fix.patch
        1 kB
        Hoss Man

        Issue Links

          Activity

          hossman Hoss Man added a comment -

          Phase one: stealthily modify (almost) all tests that use an IndexSearcher to use a new subclass which will check every matching doc/score in every search against the value from an explanation for that doc.
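The idea of a self-checking searcher can be sketched roughly as follows. This is a minimal illustration, not the attached patch: `ScoringSearcher`, `CheckingSearcher`, `fake`, and the tolerance value are hypothetical stand-ins invented for the example and are not Lucene's API.

```java
import java.util.HashMap;
import java.util.Map;

public class ExplainCheckSketch {
    /** Hypothetical stand-in for a searcher; NOT Lucene's IndexSearcher API. */
    interface ScoringSearcher {
        Map<Integer, Float> search(String query);      // docId -> score
        float explainValue(String query, int docId);   // value of the explanation
    }

    /** Assumed tolerance for float comparison; the real tests may use another. */
    static final float TOLERANCE = 1e-5f;

    /** Wraps a searcher and checks every hit's score against its explanation. */
    static final class CheckingSearcher implements ScoringSearcher {
        private final ScoringSearcher delegate;

        CheckingSearcher(ScoringSearcher delegate) {
            this.delegate = delegate;
        }

        public Map<Integer, Float> search(String query) {
            Map<Integer, Float> hits = delegate.search(query);
            for (Map.Entry<Integer, Float> hit : hits.entrySet()) {
                float explained = delegate.explainValue(query, hit.getKey());
                // Written as !(diff <= tol) so that a NaN on either side also fails.
                if (!(Math.abs(hit.getValue() - explained) <= TOLERANCE)) {
                    throw new AssertionError("doc " + hit.getKey() + ": score "
                            + hit.getValue() + " but explanation says " + explained);
                }
            }
            return hits;
        }

        public float explainValue(String query, int docId) {
            return delegate.explainValue(query, docId);
        }
    }

    /** Hypothetical fake searcher backed by fixed maps, for demonstration only. */
    static ScoringSearcher fake(final Map<Integer, Float> scores,
                                final Map<Integer, Float> explained) {
        return new ScoringSearcher() {
            public Map<Integer, Float> search(String query) {
                return scores;
            }
            public float explainValue(String query, int docId) {
                Float v = explained.get(docId);
                return v == null ? 0.0f : v;   // docs without an entry explain to 0.0
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, Float> scores = new HashMap<Integer, Float>();
        scores.put(1, 0.75f);
        ScoringSearcher searcher = new CheckingSearcher(fake(scores, scores));
        System.out.println("verified hits: " + searcher.search("anything").size());
    }
}
```

Because the check lives inside `search`, existing tests would only need to construct the wrapping searcher instead of the plain one; every query they already run then doubles as an explanation test.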

          hossman Hoss Man added a comment -

          In my haste to upload the testing patch before I left work, I failed to mention that it exposes 9 test failures, suggesting at least two bugs: one in BooleanQuery and one in SpanNearQuery.

          TestSpans.testSpanNearOrdered02
          TestSpans.testSpanNearOrdered03
          TestSpans.testSpanNearOrdered04
          TestSpans.testSpanNearOrdered05
          TestSpans.testSpanNearOrderedEqual02
          TestSpans.testSpanNearOrderedEqual03
          TestSpans.testSpanNearOrderedEqual04
          TestBoolean2.testRandomQueries
          TestBooleanMinShouldMatch.testRandomQueries

          hossman Hoss Man added a comment -

          Update to previous patch, with some additional helper utilities in CheckHits

          hossman Hoss Man added a comment -

          Some new tests covering every type of query in the "core" Lucene code base. Various examples of each query type are checked both that explanations for "matching" docs have the correct value, and that "non-matches" have an explain value of 0.0.

          Some of these tests may not be considered "fair" since they use boosts of 0.0 (and cause scores of NaN), but I was trying to be as complete as possible. As it is, there are many seemingly legitimate queries in here whose explanations are totally bogus.
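One plausible way a boost of 0.0 can end in NaN is sketched below. It assumes a query norm of the form 1/sqrt(sum of squared weights); the method names `queryNorm` and `normalizedWeight` are illustrative and simplified, not copied from the Lucene source.

```java
public class ZeroBoostNaNSketch {
    /** Assumed normalization: 1 / sqrt(sum of squared weights). */
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    /** Weight of a single-term query after normalization (simplified model). */
    static float normalizedWeight(float boost, float idf) {
        float raw = boost * idf;              // 0.0 when the boost is 0.0
        float norm = queryNorm(raw * raw);    // 1 / sqrt(0) = Infinity
        return raw * norm;                    // 0.0 * Infinity = NaN
    }

    public static void main(String[] args) {
        float w = normalizedWeight(0.0f, 2.5f);
        System.out.println(w);        // prints NaN
        System.out.println(w == w);   // prints false: NaN never equals itself
    }
}
```

Since NaN compares unequal to everything, including itself, any test that compares a NaN score with an explanation value fails regardless of what the explanation says.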

          (NOTE: classes in Zip depend on previously attached PATCH)

          hossman Hoss Man added a comment -

          Silly Hoss ... I made a big blunder when I created these tests, and didn't check that my "expected" cases were right for the basics of matching (let alone the explanations).

          Revised tests have fewer real tests, because it turns out "xx^0.0" doesn't do what I thought.

          hossman Hoss Man added a comment -

          Some fixes to BooleanQuery to make BooleanWeight.explain work correctly in more cases: specifically when minNrShouldMatch is nonzero, and when there are required or prohibited clauses. In general, the Explanation now contains more information in the non-matching situations.

          This patch does not fix situations where a clause of the BooleanQuery matches on a document, but does so with a score of 0.0 (i.e., a sub query with a boost of 0).

          This patch has no dependencies on any other patches.
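The cases described above (required clauses, prohibited clauses, and a minimum number of optional matches) can be illustrated with a simplified model. This is a sketch under stated assumptions, not the patched BooleanWeight.explain: `ClauseResult`, `Occur`, and this `Explanation` class are hypothetical types made up for the example, and the coord factor is left out.

```java
import java.util.Arrays;
import java.util.List;

public class BooleanExplainSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** Outcome of one clause for one document (hypothetical model type). */
    static final class ClauseResult {
        final Occur occur;
        final boolean matched;
        final float score;

        ClauseResult(Occur occur, boolean matched, float score) {
            this.occur = occur;
            this.matched = matched;
            this.score = score;
        }
    }

    /** Minimal explanation: a value plus a human-readable reason. */
    static final class Explanation {
        final float value;
        final String description;

        Explanation(float value, String description) {
            this.value = value;
            this.description = description;
        }
    }

    static Explanation explain(List<ClauseResult> clauses, int minShouldMatch) {
        float sum = 0.0f;
        int required = 0;
        int shouldMatched = 0;
        for (ClauseResult c : clauses) {
            if (c.occur == Occur.MUST) {
                required++;
                if (!c.matched) {
                    return new Explanation(0.0f, "no match: a required clause did not match");
                }
                sum += c.score;
            } else if (c.occur == Occur.MUST_NOT) {
                if (c.matched) {
                    return new Explanation(0.0f, "no match: a prohibited clause matched");
                }
            } else if (c.matched) {   // SHOULD clause that matched
                shouldMatched++;
                sum += c.score;
            }
        }
        if (shouldMatched < minShouldMatch) {
            return new Explanation(0.0f, "no match: only " + shouldMatched
                    + " optional clause(s) matched, need " + minShouldMatch);
        }
        if (required == 0 && shouldMatched == 0) {
            return new Explanation(0.0f, "no match: no clause matched");
        }
        return new Explanation(sum, "sum of matching clauses");
    }

    public static void main(String[] args) {
        List<ClauseResult> clauses = Arrays.asList(
                new ClauseResult(Occur.MUST, true, 1.0f),
                new ClauseResult(Occur.SHOULD, false, 0.0f));
        Explanation e = explain(clauses, 1);
        System.out.println(e.value + " : " + e.description);
    }
}
```

Note that in this model a clause that matches with a score of 0.0 contributes the same value as a clause that does not match at all, which is the situation the patch leaves open.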

          hossman Hoss Man added a comment -

          Patch of FilteredQuery that returns a correct Explanation in negative cases where the document does not pass the filter.

          This patch does not depend on any other patches.
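The negative case can be pictured with a small sketch. It assumes the filter is available as a `java.util.BitSet` of accepted doc ids; the `Explanation` class and the `explain` helper here are hypothetical and only model the behaviour described, they are not the patched FilteredQuery code.

```java
import java.util.BitSet;

public class FilteredExplainSketch {
    /** Minimal explanation: a value plus a human-readable reason. */
    static final class Explanation {
        final float value;
        final String description;

        Explanation(float value, String description) {
            this.value = value;
            this.description = description;
        }
    }

    /** Returns the inner explanation only if the document passes the filter. */
    static Explanation explain(BitSet acceptedDocs, int docId, Explanation inner) {
        if (acceptedDocs.get(docId)) {
            return inner;
        }
        // The doc is filtered out, so the search never scores it: explain as 0.0.
        return new Explanation(0.0f, "doc " + docId
                + " does not pass the filter (inner: " + inner.description + ")");
    }

    public static void main(String[] args) {
        BitSet accepted = new BitSet();
        accepted.set(3);
        Explanation inner = new Explanation(2.0f, "inner query score");
        System.out.println(explain(accepted, 3, inner).value);
        System.out.println(explain(accepted, 4, inner).value);
    }
}
```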

          hossman Hoss Man added a comment -

          HACK workaround for making SpanScorer.explain work in spite of the NearSpan bug. Fixes all existing known SpanQuery explain bugs.

          This patch has no dependencies on any other patches.

          hossman Hoss Man added a comment -

          Status Update:

          The patches so far fix all of the known issues that don't involve a Scorer "matching" a document with a score of 0.0 ... the SpanScorer patch isn't pretty ... but it may be considered acceptable depending on the scope of LUCENE-569. If LUCENE-569 does get fixed, no changes to SpanScorer for explain should be needed.

          Recent mailing list discussion on the merits of changing the Explanation API to deal with the remaining cases...

          http://www.nabble.com/BooleanWeight.normalize%28float%29-doesn%27t-normalize-prohibited-clauses--t1596471.html#a4347644

          I'm going to let all of this sit for a bit and revisit it later.

          paul.elschot@xs4all.nl Paul Elschot added a comment -

          See also LUCENE-451.

          hossman Hoss Man added a comment -

          Based on my gut feelings and some limited feedback from the list, I've committed the additions to CheckHits, the patches for BooleanQuery and FilteredQuery attached to this bug, and all of the attached test cases that pass with those changes.

          I will attach the remaining tests that expect BooleanQuery to do the right thing when a clause has a match with a score <= 0.0 to LUCENE-451, and I'll attach the one failing SpanNear test to LUCENE-569.

          I will not commit the hack fix for SpanScorer.explain, or the one-line change to every existing test case that uses a searcher.


            People

            • Assignee:
              hossman Hoss Man
              Reporter:
              hossman Hoss Man
            • Votes:
              1
              Watchers:
              0