  1. Lucene - Core
  2. LUCENE-9025

Add more efficient lookupTerm() overload to SortedSetDocValues

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: master (9.0)
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      SortedSetDocValues.lookupTerm(BytesRef) performs a binary search of the entire docValues range to find the ordinal of the requested BytesRef.

      For an individual invocation, this is optimal. Without other context, binary search needs to cover the entire space.

      But there are some common uses of lookupTerm where this shouldn't be necessary. For example: making multiple lookupTerm calls to fetch the ordinals for each value in a sorted list of terms. lookupTerm will binary-search the whole space on each invocation, even though the caller knows that there's no point searching anything before the ordinal that came back from the previous lookupTerm call.

      I propose adding a SortedSetDocValues.lookupTerm overload that takes a lower bound at which to start the binary search:

          public long lookupTerm(BytesRef key, long lowerSearchBound) throws IOException

      This saves each binary search a few iterations in usage scenarios like the one described above, which can conceivably add up.
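
      The idea can be sketched outside of Lucene as follows. This is only an illustration under stated assumptions, not the attached patch: the class BoundedLookup and its helpers utf8 and lookupAll are hypothetical names, and a sorted array of byte[] stands in for the term dictionary that SortedSetDocValues would read through BytesRef. The return convention (the ordinal if found, otherwise -(insertionPoint) - 1) is assumed to match the existing lookupTerm.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BoundedLookup {

    // Hypothetical stand-in for lookupTerm(BytesRef key, long lowerSearchBound):
    // binary search over sortedTerms, starting at lowerSearchBound instead of 0.
    // Returns the ordinal if found, otherwise -(insertionPoint) - 1.
    static long lookupTerm(byte[][] sortedTerms, byte[] key, long lowerSearchBound) {
        long low = lowerSearchBound;
        long high = sortedTerms.length - 1;
        while (low <= high) {
            long mid = (low + high) >>> 1;
            // Unsigned byte-wise comparison (assumed here to mirror how terms are ordered).
            int cmp = Arrays.compareUnsigned(sortedTerms[(int) mid], key);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid; // key found
            }
        }
        return -(low + 1); // key not found
    }

    // Looks up every key of an already-sorted key list, reusing the previous
    // result as the lower bound for the next search.
    static long[] lookupAll(byte[][] sortedTerms, byte[][] sortedKeys) {
        long[] ords = new long[sortedKeys.length];
        long lowerBound = 0;
        for (int i = 0; i < sortedKeys.length; i++) {
            long ord = lookupTerm(sortedTerms, sortedKeys[i], lowerBound);
            ords[i] = ord;
            // Found: resume at ord. Not found: resume at the insertion point.
            lowerBound = ord >= 0 ? ord : -ord - 1;
        }
        return ords;
    }

    // Helper: encode strings as UTF-8 byte arrays.
    static byte[][] utf8(String... values) {
        byte[][] out = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i].getBytes(StandardCharsets.UTF_8);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[][] terms = utf8("apple", "banana", "cherry", "date", "fig");
        byte[][] keys = utf8("banana", "cherry", "egg", "fig");
        System.out.println(Arrays.toString(lookupAll(terms, keys))); // prints [1, 2, -5, 4]
    }
}
```

      Note that the bound is only safe when the caller's keys are themselves sorted: a lower bound beyond the key's true ordinal makes the search report the key as missing.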

        Attachments

        1. LUCENE-9025.patch
          1 kB
          Jason Gerlowski

              People

              • Assignee: Unassigned
              • Reporter: Jason Gerlowski (gerlowskija)
              • Votes: 1
              • Watchers: 4
