Lucene - Core
  1. Lucene - Core
  2. LUCENE-4300

BooleanQuery inconsistently applies coord() if it rewrites itself

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0, 3.6.2, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Tripped by the new random sim from LUCENE-4297:

      The basics are this:

      • BooleanQuery has the following rewrite():
          public Query rewrite(IndexReader reader) throws IOException {
            if (minNrShouldMatch == 0 && clauses.size() == 1) {                    // optimize 1-clause queries
        
      • you have a coord() impl that doesnt return 1.0 if overlap == maxOverlap, particularly:
        return overlap / ((float)maxOverlap + 1);
        
      • TestBooleanMinShouldMatch.testRandomQueries generates random boolean queries (Q1), then compares the scores of the random query to the same query but with minNrShouldmatch applied to its should clauses (Q2)
      • in the case of a single term BQ, the rewrite applies to Q1, making it a term query, but not to Q2. so the coord() only gets called for Q2, not Q1. and with this crazy coord it means the scores are different.

      I think the rewrite is wrong, we should also rewrite single-query BQs where minNrShouldMatch = 1 and there is a single optional clause.

      1. LUCENE-4300.patch
        8 kB
        Robert Muir
      2. LUCENE-4300.patch
        2 kB
        Robert Muir
      3. LUCENE-4300.patch
        1 kB
        Robert Muir

        Activity

        Hide
        Robert Muir added a comment -

        here's a patch. Ill make a standalone test.

        Show
        Robert Muir added a comment - here's a patch. Ill make a standalone test.
        Hide
        Robert Muir added a comment -

        here's an alternative patch.

        After doing some tests, I think its too tricky to apply this coord factory consistently in BS2 when there might also be prohibited clauses: so wiring it to 1 this way when maxOverlap == 1 is easiest.

        Show
        Robert Muir added a comment - here's an alternative patch. After doing some tests, I think its too tricky to apply this coord factory consistently in BS2 when there might also be prohibited clauses: so wiring it to 1 this way when maxOverlap == 1 is easiest.
        Hide
        Robert Muir added a comment -

        updated patch: I beefed up the random tests and also added 2 standalone tests (one for minNrShouldMatch == 1, the other for prohibited clauses).

        Show
        Robert Muir added a comment - updated patch: I beefed up the random tests and also added 2 standalone tests (one for minNrShouldMatch == 1, the other for prohibited clauses).
        Hide
        Uwe Schindler added a comment -

        Closed after release.

        Show
        Uwe Schindler added a comment - Closed after release.

          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development