  1. Lucy
  2. LUCY-180

ORQuery, ANDQuery, RequiredOptionalQuery optimizations affect scoring

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.1.0 (incubating), 0.2.0 (incubating), 0.2.1 (incubating)

    Description

      ORQuery, ANDQuery, and RequiredOptionalQuery all have optimizations which kick
      in when only one child Query can match: they all compile down to the inner
      Matcher.

      In the case of ORQuery and RequiredOptionalQuery, this optimization can kick
      in per-segment, resulting in an ORMatcher/RequiredOptionalMatcher for some
      segments and e.g. a child TermMatcher for others. This skews scoring because
      coord() affects the ORMatcher/RequiredOptionalMatcher, but not the TermMatcher
      – the ORMatcher/RequiredOptionalMatcher damps the score of the matching term
      by a coord() multiplier which is typically less than 1.0, but the TermMatcher
      contributes 100% of its score. The punchline is that two documents in
      different segments which present identical match criteria can produce
      different scores, depending on whether terms not present in the document are
      represented in the segment.
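
      The skew can be illustrated with a toy model. This is only a sketch, not
      Lucy's actual code: the classes below are simplified Python stand-ins,
      coord() is assumed to be matched children divided by total children,
      every matching term is assumed to score 1.0, and
      make_matcher_optimized() is a hypothetical name for the per-segment
      optimization described above.

```python
from dataclasses import dataclass


@dataclass
class TermMatcher:
    term: str
    weight: float = 1.0

    def score(self, doc_terms):
        # Contributes its full weight when the term occurs in the document.
        return self.weight if self.term in doc_terms else 0.0


@dataclass
class ORMatcher:
    children: list

    def score(self, doc_terms):
        scores = [child.score(doc_terms) for child in self.children]
        matched = sum(1 for s in scores if s > 0.0)
        # Assumed coord(): fraction of children that matched.
        coord = matched / len(self.children)
        return sum(scores) * coord


def make_matcher_optimized(terms, segment_terms):
    """Hypothetical per-segment optimization: collapse to the single
    child when only one term exists in this segment."""
    present = [t for t in terms if t in segment_terms]
    if not present:
        return None
    if len(present) == 1:
        return TermMatcher(present[0])
    return ORMatcher([TermMatcher(t) for t in terms])


query_terms = ["foo", "bar"]
doc = {"foo"}  # the same document content in both segments

# Segment A has "bar" somewhere (in another document); segment B does not.
score_a = make_matcher_optimized(query_terms, {"foo", "bar"}).score(doc)
score_b = make_matcher_optimized(query_terms, {"foo"}).score(doc)
print(score_a, score_b)
```

      Under these assumptions the document in segment A is damped by coord()
      while the identical document in segment B is not.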

      In addition, ORQuery may compile down to a smaller ORMatcher when
      e.g. 3 out of 5 OR'd terms are present. This skews scoring for similar
      reasons.
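
      A rough numeric sketch of the shrinking case, under the same assumptions
      (coord() taken as matched children over total children, each matching
      term scoring 1.0). The or_score() helper and its shrink flag are
      illustrative only and do not correspond to anything in Lucy.

```python
def or_score(query_terms, segment_terms, doc_terms, shrink):
    """Score a document against an OR of terms.

    With shrink=True, children for terms absent from the segment are
    dropped, which changes the denominator of the assumed coord().
    """
    if shrink:
        children = [t for t in query_terms if t in segment_terms]
    else:
        children = list(query_terms)
    if not children:
        return 0.0
    matched = sum(1 for t in children if t in doc_terms)
    coord = matched / len(children)
    return matched * coord  # each matching term contributes 1.0


query = ["a", "b", "c", "d", "e"]
doc = {"a", "b"}               # identical document in both segments
small_segment = {"a", "b", "c"}  # only 3 of the 5 terms are present
full_segment = {"a", "b", "c", "d", "e"}

shrunk_small = or_score(query, small_segment, doc, shrink=True)   # coord 2/3
shrunk_full = or_score(query, full_segment, doc, shrink=True)     # coord 2/5
fixed_small = or_score(query, small_segment, doc, shrink=False)   # coord 2/5
fixed_full = or_score(query, full_segment, doc, shrink=False)     # coord 2/5
```

      With shrinking, the two segments disagree; with a fixed set of five
      children, they agree.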

      To present consistent scoring across all segments, Queries should always
      compile down to the same Matcher node structure for each segment. By the time
      you are compiling per-segment Matchers, it is too late to re-calculate the
      weighting, so you can't optimize the Matcher structure when you find that e.g.
      one of two terms doesn't exist in a given segment.
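
      One possible way to keep the structure identical is to substitute a
      placeholder that never matches for any term missing from the segment.
      This is a sketch under assumptions, not necessarily what the attached
      patches do: NoMatchMatcher and make_matcher_consistent() are
      hypothetical names, and coord() is again assumed to be matched children
      over total children.

```python
from dataclasses import dataclass


@dataclass
class TermMatcher:
    term: str
    weight: float = 1.0

    def score(self, doc_terms):
        return self.weight if self.term in doc_terms else 0.0


class NoMatchMatcher:
    """Hypothetical placeholder for a term absent from the segment."""

    def score(self, doc_terms):
        return 0.0


@dataclass
class ORMatcher:
    children: list

    def score(self, doc_terms):
        scores = [child.score(doc_terms) for child in self.children]
        matched = sum(1 for s in scores if s > 0.0)
        coord = matched / len(self.children)  # assumed coord()
        return sum(scores) * coord


def make_matcher_consistent(terms, segment_terms):
    # Always one child per query term, so every segment sees the same shape.
    children = [
        TermMatcher(t) if t in segment_terms else NoMatchMatcher()
        for t in terms
    ]
    return ORMatcher(children)


query_terms = ["foo", "bar"]
doc = {"foo"}
score_a = make_matcher_consistent(query_terms, {"foo", "bar"}).score(doc)
score_b = make_matcher_consistent(query_terms, {"foo"}).score(doc)
```

      Both segments now yield the same score for the same document, at the
      cost of keeping a child that can never match.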

      In addition, when compiling down to a single child Matcher, ORQuery, ANDQuery
      and RequiredOptionalQuery all discard custom boosts. This is solvable by
      moving the optimization from Compiler_Make_Matcher() up into
      Query_Make_Compiler().
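
      The boost problem and the proposed relocation can be sketched as
      follows. The classes are simplified Python stand-ins loosely modeled on
      the Query/Compiler split, not Lucy's C API; the make_compiler(boost)
      signature and the rule that a parent's boost multiplies into the child
      are assumptions made for illustration, and coord() is omitted for
      brevity.

```python
from dataclasses import dataclass


@dataclass
class TermCompiler:
    term: str
    boost: float

    def score(self, doc_terms):
        return self.boost if self.term in doc_terms else 0.0


@dataclass
class ORCompiler:
    children: list
    boost: float

    def score(self, doc_terms):
        return self.boost * sum(c.score(doc_terms) for c in self.children)


@dataclass
class TermQuery:
    term: str
    boost: float = 1.0

    def make_compiler(self, boost=1.0):
        return TermCompiler(self.term, self.boost * boost)


@dataclass
class ORQuery:
    children: list
    boost: float = 1.0

    def make_compiler(self, boost=1.0):
        effective = self.boost * boost
        if len(self.children) == 1:
            # Optimize at the Query level: hand the parent's boost down
            # to the lone child instead of dropping it later.
            return self.children[0].make_compiler(boost=effective)
        return ORCompiler([c.make_compiler() for c in self.children], effective)


boosted = ORQuery([TermQuery("foo", boost=1.5)], boost=2.0)
compiler = boosted.make_compiler()
score = compiler.score({"foo"})
```

      Because the single-child shortcut is taken while weights are still being
      computed, the ORQuery's custom boost survives in the child's compiler.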

      Attachments

        1. LUCY-180.patch
          9 kB
          Marvin Humphrey
        2. LUCY-180-minimal.patch
          4 kB
          Marvin Humphrey

          People

            marvin Marvin Humphrey