Details
-
Improvement
-
Status: Resolved
-
Minor
-
Resolution: Duplicate
-
9.0
-
None
-
None
-
New, Patch Available
Description
SortedSetDocValues.lookupTerm(BytesRef) performs a binary search of the entire docValues range to find the ordinal of the requested BytesRef.
For an individual invocation, this is optimal. Without other context, binary search needs to cover the entire space.
But there are some common uses of lookupTerm where this shouldn't be necessary. For example: making multiple lookupTerm calls to fetch the ordinals for each value in a sorted list of terms. lookupTerm will binary-search the whole space on each invocation, even though the caller knows that there's no point searching anything before the ordinal that came back from the previous lookupTerm call.
I propose we add a SortedSetDocValues.lookupTerm overload which takes a lower bound at which to start the binary search:
public long lookupTerm(BytesRef key, long lowerSearchBound) throws IOException
This saves each binary search a few iterations in usage scenarios like the one described above, which can conceivably add up over many lookups.
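To make the idea concrete, here is a minimal standalone sketch of a lower-bounded binary search and the sorted-lookup loop that would use it. It is not the attached patch and does not call into Lucene: the class BoundedLookupSketch, the method lookupSorted, and the use of a sorted String[] as a stand-in for the terms dictionary are all assumptions made for illustration. It also assumes the same return convention as java.util.Arrays.binarySearch, where a missing key yields -(insertionPoint) - 1.

```java
// Hypothetical stand-in for a terms dictionary: a sorted String[] whose
// array index plays the role of the ordinal. Not Lucene code.
final class BoundedLookupSketch {

    /**
     * Binary search restricted to ordinals in [lowerSearchBound, terms.length).
     * Returns the ordinal of key if present, otherwise -(insertionPoint) - 1.
     */
    static long lookupTerm(String[] terms, String key, long lowerSearchBound) {
        long low = lowerSearchBound;
        long high = terms.length - 1L;
        while (low <= high) {
            long mid = (low + high) >>> 1;
            int cmp = terms[(int) mid].compareTo(key);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid; // key found at this ordinal
            }
        }
        return -(low + 1); // key absent; low is the insertion point
    }

    /**
     * Looks up every key of an ascending-sorted key list, feeding each result
     * back in as the lower bound for the next search.
     */
    static long[] lookupSorted(String[] terms, String[] sortedKeys) {
        long[] ords = new long[sortedKeys.length];
        long lowerBound = 0;
        for (int i = 0; i < sortedKeys.length; i++) {
            long ord = lookupTerm(terms, sortedKeys[i], lowerBound);
            ords[i] = ord;
            // A hit bounds the next search at that ordinal; a miss bounds it
            // at the insertion point, since later keys sort at or after it.
            lowerBound = ord >= 0 ? ord : -ord - 1;
        }
        return ords;
    }
}
```

The sketch only trims the left edge of the search range, so the caller must supply keys in ascending order; an out-of-order key could be reported as missing even though it exists below the bound.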
Attachments
Issue Links
- duplicates LUCENE-8836 Optimize DocValues TermsDict to continue scanning from the last position when possible (Closed)