  1. Lucene - Core
  2. LUCENE-954

Toggle score normalization in Hits

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.2, 2.3, 2.3.1, 2.4
    • Fix Version/s: 2.4
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      any

    • Lucene Fields:
      New, Patch Available

      Description

      The current implementation of the "Hits" class sometimes performs score normalization.
      In particular, whenever the top-ranked score is greater than 1.0, all scores are scaled so that the top score becomes 1.0.
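To make the rule concrete, here is a minimal sketch of the arithmetic, assuming the factor is 1/maxScore and is applied to every hit (which matches the "one multiplication per retrieved hit" mentioned further down). The class and method names are hypothetical and this is not the actual Lucene source.

```java
// Illustrative sketch only; names are hypothetical, not Lucene API.
final class HitsNormalizationSketch {

    /** Factor applied to every raw score: 1/maxScore if maxScore > 1.0, otherwise 1.0. */
    static float scoreNorm(float maxScore) {
        return maxScore > 1.0f ? 1.0f / maxScore : 1.0f;
    }

    /** Scales all raw scores by the factor derived from the largest score. */
    static float[] normalize(float[] rawScores) {
        float max = 0.0f;
        for (float s : rawScores) {
            max = Math.max(max, s);
        }
        float norm = scoreNorm(max);
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * norm;
        }
        return out;
    }
}
```

With raw scores 2.0, 1.0 and 0.5 the factor is 0.5, so the list becomes 1.0, 0.5 and 0.25. A list whose top score is 0.8 is returned unchanged, which is why the normalization only happens "sometimes".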

      In this case, Hits may return different scores than TopDocs-based methods.

      In my scenario (a federated search system), Hits delivered plainly wrong results.
      I was merging results from several sources, all with homogeneous statistics (similar to MultiSearcher, but over the Internet using HTTP/XML-based protocols).
      Sometimes, some of the sources had a top score greater than 1, so their scores were normalized while the others were not, and I ended up with garbled results.
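The effect on a merged ranking can be illustrated with a small, self-contained sketch. It assumes two sources whose raw scores are directly comparable and a normalization factor of 1/maxScore applied only when the top score exceeds 1.0. The `Hit` class, the method names, and the example scores are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not Lucene API and not the reporter's system.
final class FederatedMergeSketch {

    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Hits-style normalization: scale by 1/max only when max > 1.0. */
    static List<Hit> normalizeLikeHits(List<Hit> hits) {
        float max = 0.0f;
        for (Hit h : hits) {
            max = Math.max(max, h.score);
        }
        float norm = max > 1.0f ? 1.0f / max : 1.0f;
        List<Hit> out = new ArrayList<>();
        for (Hit h : hits) {
            out.add(new Hit(h.id, h.score * norm));
        }
        return out;
    }

    /** Merges the per-source lists and returns the ids by descending score. */
    static List<String> mergedOrder(List<List<Hit>> sources) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> s : sources) {
            all.addAll(s);
        }
        all.sort((x, y) -> Float.compare(y.score, x.score));
        List<String> ids = new ArrayList<>();
        for (Hit h : all) {
            ids.add(h.id);
        }
        return ids;
    }
}
```

Suppose source A returns raw scores 2.0 (A1) and 1.0 (A2), and source B returns 0.8 (B1) and 0.6 (B2). Merging the raw scores gives A1, A2, B1, B2. If each source's list is normalized first, only A is rescaled (to 1.0 and 0.5), and the merged order becomes A1, B1, B2, A2.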

      I suggest adding a switch to enable/disable this score normalization at runtime.
      My patch (attached) has an additional performance benefit, since score normalization now occurs only when Hits#score() is called, not when creating the Hits result list. Whenever scores are not required, you save one multiplication per retrieved hit (i.e., at least 100 multiplications with the current implementation of Hits).
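The idea of a runtime switch combined with lazy normalization might look roughly like the sketch below. The attached patch is not reproduced here: the class name, the `setNormalize` method, and the default value are assumptions made for illustration only.

```java
// Illustrative sketch only; hypothetical names, not the attached patch.
final class LazyScoreSketch {
    private final float[] rawScores;   // stored untouched when the list is built
    private final float scoreNorm;     // computed once from the top score
    private boolean normalize = true;  // assumed default: keep the existing behavior

    LazyScoreSketch(float[] rawScores, float maxScore) {
        this.rawScores = rawScores.clone();
        this.scoreNorm = maxScore > 1.0f ? 1.0f / maxScore : 1.0f;
    }

    /** Runtime switch: when false, score(n) returns the raw score. */
    void setNormalize(boolean normalize) {
        this.normalize = normalize;
    }

    /** The multiplication happens here, per call, not when the list is created. */
    float score(int n) {
        return normalize ? rawScores[n] * scoreNorm : rawScores[n];
    }
}
```

In this arrangement, a caller that never asks for scores never pays for the multiplication, and a federated merger can call `setNormalize(false)` to obtain raw scores that are comparable across sources.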

        Attachments

        1. LUCENE-954.patch
          7 kB
          Christian Kohlschütter
        2. hits-scoreNorm.patch
          3 kB
          Christian Kohlschütter

            People

            • Assignee: Otis Gospodnetic (otis)
            • Reporter: Christian Kohlschütter (ck@newsclub.de)
            • Votes: 0
            • Watchers: 0