  1. Lucene - Core
  2. LUCENE-9113

Speed up merging doc values terms dictionaries

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.5
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The default DocValuesConsumer#mergeSortedField and DocValuesConsumer#mergeSortedSetField implementations create a merged view of the doc values producers to merge. Unfortunately, this merged view doesn't override termsEnum(), whose default implementation of next() increments the ordinal and calls lookupOrd() to retrieve the term. Currently, lookupOrd() doesn't take advantage of its current position: it seeks to the start of the block and then calls next() up to 16 times to reach the desired term. While there are discussions about optimizing lookups to take advantage of the current ord (LUCENE-8836), that shouldn't be required for merging to be efficient. We should instead make next() on the merged view call next() on its sub enums.
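
To illustrate the proposed direction, here is a minimal sketch in plain Java of merging terms by only ever calling next() on the sub enums, with no ordinal lookups. It is not Lucene code: `MergedTermsSketch`, `SubEnum`, and `mergeTerms` are hypothetical names, each segment is modeled as a sorted list of strings, and `String.compareTo` stands in for Lucene's byte-wise term ordering.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not Lucene's implementation.
final class MergedTermsSketch {

    /** One sub enum: a forward-only cursor over a segment's sorted terms. */
    private static final class SubEnum {
        private final Iterator<String> it;
        String current;

        SubEnum(List<String> sortedTerms) {
            this.it = sortedTerms.iterator();
        }

        /** Advances to the next term; returns false once the segment is exhausted. */
        boolean next() {
            if (it.hasNext()) {
                current = it.next();
                return true;
            }
            current = null;
            return false;
        }
    }

    /** Merges the segments' terms in sorted order, emitting each distinct term once. */
    static List<String> mergeTerms(List<List<String>> segments) {
        // Order sub enums by their current term (assumption: String order
        // is a stand-in for the real term comparator).
        PriorityQueue<SubEnum> queue =
            new PriorityQueue<>((a, b) -> a.current.compareTo(b.current));
        for (List<String> segment : segments) {
            SubEnum sub = new SubEnum(segment);
            if (sub.next()) {
                queue.add(sub);
            }
        }

        List<String> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            SubEnum top = queue.poll();
            String term = top.current;
            // Skip terms already emitted from another segment.
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(term)) {
                merged.add(term);
            }
            // Sequential advance: one step per term, never a seek to a block start.
            if (top.next()) {
                queue.add(top);
            }
        }
        return merged;
    }
}
```

In this model, each sub enum is advanced exactly once per term it holds. This differs from the behavior described above, where each lookupOrd() call re-seeks to the block start and then steps forward.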

              People

              • Assignee: Unassigned
              • Reporter: Adrien Grand (jpountz)

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 0.5h