  1. Lucene - Core
  2. LUCENE-9113

Speed up merging doc values terms dictionaries

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.5
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      The default DocValuesConsumer#mergeSortedField and DocValuesConsumer#mergeSortedSetField implementations create a merged view of the doc values producers to merge. Unfortunately, this merged view doesn't override termsEnum(), whose default implementation of next() increments the ordinal and calls lookupOrd() to retrieve the term. Currently, lookupOrd() doesn't take advantage of its current position: it seeks to the start of the block and then calls next() up to 16 times to reach the desired term. While there are discussions about optimizing lookups to take advantage of the current ord (LUCENE-8836), merging shouldn't depend on that to be efficient. Instead, the merged view's next() should call next() on its sub enums.
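
The idea of driving next() from the sub enums can be sketched in plain Java without any Lucene dependency. This is only an illustration under simplifying assumptions, not the committed patch: MergedTermsSketch, SubEnum and mergeTerms are hypothetical names, terms are modeled as String rather than BytesRef, and each segment is modeled as an already-sorted list. Each sub enum is advanced sequentially and a priority queue picks the smallest current term, so no per-term lookupOrd()-style seek is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class MergedTermsSketch {

    /** Hypothetical stand-in for one segment's terms enum: sorted terms plus a cursor. */
    static final class SubEnum {
        private final Iterator<String> terms;
        String current;

        SubEnum(List<String> sortedTerms) {
            this.terms = sortedTerms.iterator();
        }

        /** Moves to the next term sequentially; returns false once exhausted. */
        boolean next() {
            if (!terms.hasNext()) {
                current = null;
                return false;
            }
            current = terms.next();
            return true;
        }
    }

    /** Merges sorted per-segment term lists into one sorted, de-duplicated list. */
    public static List<String> mergeTerms(List<List<String>> segments) {
        // Order the sub enums by the term they are currently positioned on.
        PriorityQueue<SubEnum> queue =
            new PriorityQueue<>((a, b) -> a.current.compareTo(b.current));
        for (List<String> segment : segments) {
            SubEnum sub = new SubEnum(segment);
            if (sub.next()) {
                queue.add(sub);
            }
        }

        List<String> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            SubEnum top = queue.poll();
            String term = top.current;
            // The same term may appear in several segments; emit it only once.
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(term)) {
                merged.add(term);
            }
            // Advance only the sub enum that was consumed, then re-queue it.
            if (top.next()) {
                queue.add(top);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
            Arrays.asList("apple", "cherry", "fig"),
            Arrays.asList("banana", "cherry", "grape"));
        System.out.println(mergeTerms(segments)); // prints [apple, banana, cherry, fig, grape]
    }
}
```

In this sketch every term of every segment is visited exactly once through next(), whereas the ordinal-based default described above would pay a block seek plus up to 16 next() calls for each merged term.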

            People

              Assignee: Unassigned
              Reporter: Adrien Grand (jpountz)
              Votes: 0
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h