Details
- Type: Test
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
In the dev mailing list thread https://lists.apache.org/thread/113d1yzty5ryvyt2o9msfytldv41qpgq, 4nn4r shared their discovery of the org.apache.solr.ltr.LTRRescorer#scoreSingleHit code block and of how hitUpto >= topN never arises. I agree that the condition currently never evaluates to true because of how the rescore method is called:
- The Solr ReRankCollector caps the number of documents that are passed to the rescore method: https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/solr/core/src/java/org/apache/solr/search/ReRankCollector.java#L112-L119
- The Lucene Rescorer API, however, describes topN as the number of hits to return, i.e. it is possible for firstPassTopDocs to contain more than topN documents: https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/lucene/core/src/java/org/apache/lucene/search/Rescorer.java#L51
This ticket proposes to expand LTRRescorer.rescore test coverage to include the "more than topN documents are passed to be rescored" scenario.
(I'd like to leave the question of whether the code block could or should be removed out of scope for this ticket. On a high level, not supporting topN != firstPassTopDocs.scoreDocs.length in LTRRescorer could simplify its code. On a practical level, however, backwards compatibility would need consideration, at least theoretically: some custom ReRankCollector that we don't know of might, for some reason, not cap the number of documents passed in the way ReRankCollector does.)
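To make the "more than topN documents are passed to be rescored" scenario concrete, here is a minimal, self-contained sketch. It is not the LTRRescorer code and does not use the Lucene API: the Hit class, the rescore signature and the toy model are all hypothetical stand-ins, assuming only that the rescorer keeps the best topN rescored hits in a min-heap. The else branch is the one that can only be reached when the caller passes in more than topN first-pass hits.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntToDoubleFunction;

public class TopNRescoreSketch {

    /** Hypothetical stand-in for a scored document: a document id plus a score. */
    static final class Hit {
        final int doc;
        final double score;

        Hit(int doc, double score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Rescores every first-pass hit with the given (hypothetical) model and
     * returns at most topN hits, best first. A min-heap holds the best hits so far.
     */
    static List<Hit> rescore(List<Hit> firstPass, IntToDoubleFunction model, int topN) {
        PriorityQueue<Hit> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
        int hitUpto = 0;
        for (Hit hit : firstPass) {
            Hit rescored = new Hit(hit.doc, model.applyAsDouble(hit.doc));
            if (hitUpto < topN) {
                // Still filling the first topN slots.
                heap.add(rescored);
            } else if (topN > 0 && rescored.score > heap.peek().score) {
                // hitUpto >= topN: only reachable when more than topN hits are passed in.
                heap.poll();
                heap.add(rescored);
            }
            hitUpto++;
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return result;
    }

    public static void main(String[] args) {
        // Five first-pass hits, but only the top three rescored hits are requested.
        List<Hit> firstPass = new ArrayList<>();
        for (int doc = 0; doc < 5; doc++) {
            firstPass.add(new Hit(doc, 5 - doc));
        }
        // Toy model that reverses the first-pass order.
        for (Hit h : rescore(firstPass, doc -> doc * 10.0, 3)) {
            System.out.println(h.doc + " " + h.score);
        }
    }
}
```

With five first-pass hits and topN = 3, the sketch returns documents 4, 3 and 2 in that order, so a test built along these lines would exercise the branch that the capped ReRankCollector call never reaches.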
Issue Links
- relates to SOLR-15873 simplify LTR[Interleaving]Rescorer code (Open)