Details
Improvement
Status: Closed
Major
Resolution: Won't Fix
2.2, 2.3, 2.3.1, 2.4
None
any
New, Patch Available
Description
The current implementation of the "Hits" class sometimes performs score normalization.
In particular, whenever the top-ranked score is greater than 1.0, the scores are scaled so that the maximum becomes 1.0.
In this case, Hits may return different score results than TopDocs-based methods.
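To make the behavior concrete, here is a minimal sketch in plain Java of the normalization described above. It is not the Lucene source; the class name `HitsNormalizationSketch` and the method `normalize` are made up for illustration, and the only assumption is that every score is multiplied by 1/maxScore when the top score exceeds 1.0.

```java
import java.util.Arrays;

public class HitsNormalizationSketch {

    /**
     * Hypothetical helper: scales scores only when the top score is above 1.0,
     * mirroring the behavior described for Hits. Lower result sets are untouched.
     */
    static float[] normalize(float[] rawScores) {
        float maxScore = 0.0f;
        for (float s : rawScores) {
            maxScore = Math.max(maxScore, s);
        }
        // Normalization factor: 1/maxScore if the top score exceeds 1.0, else 1.
        float scoreNorm = (rawScores.length > 0 && maxScore > 1.0f) ? 1.0f / maxScore : 1.0f;

        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * scoreNorm;
        }
        return out;
    }

    public static void main(String[] args) {
        // Top score 2.0 > 1.0: everything is scaled by 0.5.
        System.out.println(Arrays.toString(normalize(new float[] {2.0f, 1.0f, 0.5f}))); // [1.0, 0.5, 0.25]
        // Top score 0.8 <= 1.0: scores are returned unchanged.
        System.out.println(Arrays.toString(normalize(new float[] {0.8f, 0.4f})));       // [0.8, 0.4]
    }
}
```

A caller reading the raw scores (as with the TopDocs-based methods) would see 2.0, 1.0, 0.5 for the first result set, whereas the normalized view reports 1.0, 0.5, 0.25.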
In my scenario (a federated search system), Hits delivered just plain wrong results.
I was merging results from several sources, all having homogeneous statistics (similar to MultiSearcher, but over the Internet using HTTP/XML-based protocols).
Sometimes, some of the sources had a top-score greater than 1, so I ended up with garbled results.
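The following sketch illustrates how the merged ranking can go wrong. It uses invented scores and hypothetical names (`FederatedMergeSketch`, `Hit`, `normalizeLikeHits`, `mergedIds`), and it assumes each source normalizes independently before the results are merged by score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FederatedMergeSketch {

    /** Hypothetical result entry: a document id and its score. */
    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Scales one source's scores by 1/max, but only if its top score exceeds 1.0. */
    static List<Hit> normalizeLikeHits(List<Hit> hits) {
        float max = 0.0f;
        for (Hit h : hits) {
            max = Math.max(max, h.score);
        }
        float norm = max > 1.0f ? 1.0f / max : 1.0f;
        List<Hit> out = new ArrayList<>();
        for (Hit h : hits) {
            out.add(new Hit(h.id, h.score * norm));
        }
        return out;
    }

    /** Merges all sources and returns the ids ordered by descending score. */
    static List<String> mergedIds(List<List<Hit>> sources) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> source : sources) {
            all.addAll(source);
        }
        all.sort((x, y) -> Float.compare(y.score, x.score));
        List<String> ids = new ArrayList<>();
        for (Hit h : all) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> a = Arrays.asList(new Hit("A1", 4.0f), new Hit("A2", 2.0f)); // top score > 1
        List<Hit> b = Arrays.asList(new Hit("B1", 0.9f), new Hit("B2", 0.8f)); // top score <= 1

        // Merging raw scores keeps the comparable order.
        System.out.println(mergedIds(Arrays.asList(a, b)));                       // [A1, A2, B1, B2]
        // Only source A is rescaled, so A2 drops below both hits from B.
        System.out.println(mergedIds(Arrays.asList(normalizeLikeHits(a), normalizeLikeHits(b)))); // [A1, B1, B2, A2]
    }
}
```

Because only the source whose top score exceeds 1.0 is rescaled, its scores are no longer comparable with those of the other sources, even though the underlying statistics are homogeneous.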
I suggest adding a switch to enable or disable this score normalization at runtime.
My patch (attached) has an additional performance benefit: score normalization now occurs only when Hits#score() is called, not when the Hits result list is created. Whenever scores are not required, you save one multiplication per retrieved hit (i.e., at least 100 multiplications with the current implementation of Hits).
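As a rough illustration of the idea (not the attached patch itself), a switchable, lazily normalizing score holder might look like the sketch below. The class `LazyNormalizedScores` and the method `setNormalizeScores` are hypothetical names; only the notion of deferring the multiplication until score() is called comes from the description above.

```java
public class LazyNormalizedScores {

    private final float[] rawScores;  // scores exactly as delivered by the search
    private final float scoreNorm;    // 1/maxScore if maxScore > 1.0, else 1.0
    private boolean normalizeScores = true;

    LazyNormalizedScores(float[] rawScores, float maxScore) {
        this.rawScores = rawScores.clone();
        // Only the factor is computed up front; no per-hit multiplication happens here.
        this.scoreNorm = (rawScores.length > 0 && maxScore > 1.0f) ? 1.0f / maxScore : 1.0f;
    }

    /** Hypothetical runtime switch: false returns raw, unnormalized scores. */
    void setNormalizeScores(boolean normalizeScores) {
        this.normalizeScores = normalizeScores;
    }

    /** The multiplication is paid only when a score is actually requested. */
    float score(int n) {
        return normalizeScores ? rawScores[n] * scoreNorm : rawScores[n];
    }

    public static void main(String[] args) {
        LazyNormalizedScores scores = new LazyNormalizedScores(new float[] {4.0f, 2.0f}, 4.0f);
        System.out.println(scores.score(1)); // 0.5 (normalized)
        scores.setNormalizeScores(false);
        System.out.println(scores.score(1)); // 2.0 (raw)
    }
}
```

With the switch turned off, a federated merger can read raw scores from every source, and callers that never ask for a score pay no normalization cost at all.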