Lucene - Core / LUCENE-9025

Add more efficient lookupTerm() overload to SortedSetDocValues

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 9.0
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      SortedSetDocValues.lookupTerm(BytesRef) performs a binary search of the entire docValues range to find the ordinal of the requested BytesRef.

      For an individual invocation, this is optimal: without other context, the binary search has to cover the entire space.

      But in some common uses of lookupTerm this shouldn't be necessary. For example, a caller may make multiple lookupTerm calls to fetch the ordinal of each value in a sorted list of terms. lookupTerm binary-searches the whole space on each invocation, even though the caller knows there is no point searching anything before the ordinal returned by the previous lookupTerm call.

      I propose we add a SortedSetDocValues.lookupTerm overload that takes a lower bound at which to start the binary search:

        public long lookupTerm(BytesRef key, long lowerSearchBound) throws IOException

      This saves each binary search a few iterations in usage scenarios like the one described above, which can conceivably add up.
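
      To make the idea concrete, here is a minimal standalone sketch, not the attached patch. It models the term dictionary as a sorted String[] instead of using Lucene classes, and assumes the usual lookupTerm return convention (the ordinal when the key is found, otherwise -(insertionPoint) - 1). The class name BoundedLookupSketch and the helper lookupSorted are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Standalone model of the proposal; a sorted String[] stands in for the term dictionary. */
final class BoundedLookupSketch {

    /** Unbounded lookup: searches the whole range, as lookupTerm(BytesRef) does today. */
    static long lookupTerm(String[] sortedTerms, String key) {
        return lookupTerm(sortedTerms, key, 0);
    }

    /**
     * Hypothetical overload: binary search restricted to [lowerSearchBound, length).
     * The result is only meaningful if every term below the bound is smaller than key.
     */
    static long lookupTerm(String[] sortedTerms, String key, long lowerSearchBound) {
        long low = lowerSearchBound;
        long high = sortedTerms.length - 1L;
        while (low <= high) {
            long mid = (low + high) >>> 1;
            int cmp = sortedTerms[(int) mid].compareTo(key);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid; // key found at this ordinal
            }
        }
        return -(low + 1); // key absent; low is the insertion point
    }

    /** Resolves ordinals for keys sorted ascending, reusing each result as the next lower bound. */
    static long[] lookupSorted(String[] sortedTerms, List<String> sortedKeys) {
        long[] ords = new long[sortedKeys.size()];
        long lowerBound = 0;
        for (int i = 0; i < ords.length; i++) {
            long ord = lookupTerm(sortedTerms, sortedKeys.get(i), lowerBound);
            ords[i] = ord;
            // Found: later keys are >= this one, so restart at ord.
            // Absent: restart at the insertion point, since nothing before it can match.
            lowerBound = ord >= 0 ? ord : -(ord + 1);
        }
        return ords;
    }

    public static void main(String[] args) {
        String[] terms = {"apple", "banana", "cherry", "grape", "mango", "peach"};
        long[] ords = lookupSorted(terms, List.of("banana", "fig", "mango"));
        System.out.println(Arrays.toString(ords)); // prints [1, -4, 4]
    }
}
```

      In this model each lookup after the first searches only the tail of the range that can still contain a match, which is where the saved iterations come from. The precondition matters: passing a bound that skips terms not smaller than the key would return a wrong result, so the overload only suits callers that iterate their keys in sorted order.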

      Attachments

        1. LUCENE-9025.patch
          1 kB
          Jason Gerlowski

            People

              Assignee: Unassigned
              Reporter: Jason Gerlowski (gerlowskija)
              Votes: 1
              Watchers: 4
