Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-6901

Optimize 1D dimensional value indexing

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Dimensional values give a smaller index, and faster search times, for indexing ordered byte[] values across one or more dimensions, vs our existing approaches, but the indexing time is substantially slower.

      Since the 1D case is so important/common (numeric fields, range query) I think it's worth optimizing its indexing time. It should also be possible to optimize the N > 1 dimensions case too, but it's more complex ... we can postpone that.

      So for the 1D case, I changed the merge method to do a merge sort (like postings) of the already sorted segments dimensional values, instead of simply re-indexing all values from the incoming segments, and this was a big speedup.

      I also changed from InPlaceMergeSorter to IntroSorter (this is what postings use, and it's faster but still safe) and this was another good speedup, which should also help the > 1D cases.

      Finally, I added a BKDReader.verify method (currently it's dark: NOT called) that walks the index and then check that every value in each leaf block does in fact fall within what the index expected/claimed. This is useful for finding bugs! Maybe we can cleanly fold it into CheckIndex somehow later.

        Attachments

        1. LUCENE-6901.patch
          55 kB
          Michael McCandless
        2. LUCENE-6901-timsort.patch
          58 kB
          Michael McCandless
        3. LUCENE-6901.patch
          56 kB
          Michael McCandless

          Activity

            People

            • Assignee:
              mikemccand Michael McCandless
              Reporter:
              mikemccand Michael McCandless
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: