Lucene - Core / LUCENE-6697

Use 1D KD tree as an alternative to postings-based numeric range filters

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      Today Lucene uses postings to index a numeric value at multiple
      precision levels for fast range searching. It's somewhat costly: each
      numeric value is indexed with multiple terms (4 terms by default)
      ... I think a dedicated 1D BKD tree should be more compact and perform
      better.
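
      As a rough illustration of why the postings approach costs several
      terms per value (this is not code from the patch), the sketch below
      derives one prefix term per precision level by dropping low-order
      bits. The class and method names are hypothetical, the 16-bit
      precision step is an assumption chosen because it yields 4 terms for
      a 64-bit value, and the real encoding in Lucene differs in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one indexed term per precision level of a 64-bit value.
class PrefixTermsSketch {

    // Returns a readable description of each term that would be indexed.
    // precisionStep is the number of low-order bits dropped per level (assumed 16).
    public static List<String> prefixTerms(long value, int precisionStep) {
        List<String> terms = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            // Unsigned shift keeps only the high-order bits at this level.
            long prefix = value >>> shift;
            terms.add("shift=" + shift + " prefix=0x" + Long.toHexString(prefix));
        }
        return terms;
    }

    public static void main(String[] args) {
        // With a 16-bit step, a 64-bit value produces 64 / 16 = 4 terms.
        for (String term : prefixTerms(0x0123456789ABCDEFL, 16)) {
            System.out.println(term);
        }
    }
}
```

      Every document carrying the value is added to the postings of each
      of those terms, which is the overhead a single sorted structure is
      meant to avoid.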

      It should also easily generalize beyond 64 bits to arbitrary byte[],
      e.g. for LUCENE-5596, but I haven't explored that here.

      A 1D BKD tree just sorts all values, then groups adjacent values,
      together with their docIDs, into leaf blocks of 512-1024 values each
      (by default) and indexes those blocks in a fully balanced binary
      tree. Building the range filter is then just a recursive walk
      through this tree.
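
      The idea can be sketched in memory as follows. This is a minimal
      illustration under stated assumptions, not the patch's
      implementation: the class name, the array-based layout, the implicit
      tree over leaf-block indices, and the tiny leaf size used in main
      are all hypothetical, and nothing here reflects the on-disk format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory sketch of a 1D BKD-style range lookup.
class OneDimBkdSketch {
    private final long[] values; // all values, sorted ascending
    private final int[] docIds;  // docIds[i] is the document owning values[i]
    private final int leafSize;  // values per leaf block (512-1024 in the issue)

    // vals[docId] is the single numeric value of document docId.
    public OneDimBkdSketch(long[] vals, int leafSize) {
        this.leafSize = leafSize;
        Integer[] order = new Integer[vals.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Long.compare(vals[a], vals[b]));
        values = new long[vals.length];
        docIds = new int[vals.length];
        for (int i = 0; i < order.length; i++) {
            values[i] = vals[order[i]];
            docIds[i] = order[i];
        }
    }

    // Collects the docIDs whose value lies in [min, max], both inclusive.
    public List<Integer> range(long min, long max) {
        List<Integer> hits = new ArrayList<>();
        int numLeaves = (values.length + leafSize - 1) / leafSize;
        if (numLeaves > 0) walk(0, numLeaves, min, max, hits);
        return hits;
    }

    // Recursive walk over the balanced tree covering leaf blocks [loLeaf, hiLeaf).
    private void walk(int loLeaf, int hiLeaf, long min, long max, List<Integer> hits) {
        int start = loLeaf * leafSize;
        int end = Math.min(hiLeaf * leafSize, values.length);
        long cellMin = values[start];
        long cellMax = values[end - 1];
        if (cellMax < min || cellMin > max) return; // cell is outside the query
        if (cellMin >= min && cellMax <= max) {     // cell is fully inside: take all
            for (int i = start; i < end; i++) hits.add(docIds[i]);
            return;
        }
        if (hiLeaf - loLeaf == 1) {                 // partially covered leaf: filter
            for (int i = start; i < end; i++) {
                if (values[i] >= min && values[i] <= max) hits.add(docIds[i]);
            }
            return;
        }
        int mid = (loLeaf + hiLeaf) >>> 1;          // split the cell in half
        walk(loLeaf, mid, min, max, hits);
        walk(mid, hiLeaf, min, max, hits);
    }

    public static void main(String[] args) {
        OneDimBkdSketch tree = new OneDimBkdSketch(new long[] {50, 10, 40, 20, 30, 60, 70}, 2);
        System.out.println(tree.range(20, 50));
    }
}
```

      Only leaf blocks that straddle a query boundary need per-value
      checks; blocks fully inside the range contribute all of their docIDs
      without inspecting individual values.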

      It's the same structure we use for the 2D lat/lon BKD tree, just in
      1D. I implemented it as a DocValuesFormat that also writes the
      numeric tree on the side.

        Attachments

        1. LUCENE-6697.patch
          95 kB
          Michael McCandless
        2. LUCENE-6697.patch
          119 kB
          Michael McCandless
        3. LUCENE-6697.patch
          116 kB
          Michael McCandless

            People

            • Assignee: Michael McCandless (mikemccand)
            • Reporter: Michael McCandless (mikemccand)
