Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-6697

Use 1D KD tree for alternative to postings based numeric range filters

Details

    • New Feature
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 5.3, 6.0
    • None
    • None
    • New

    Description

      Today Lucene uses postings to index a numeric value at multiple
      precision levels for fast range searching. It's somewhat costly: each
      numeric value is indexed with multiple terms (4 terms by default)
      ... I think a dedicated 1D BKD tree should be more compact and perform
      better.

      It should also easily generalize beyond 64 bits to arbitrary byte[],
      e.g. for LUCENE-5596, but I haven't explored that here.

      A 1D BKD tree just sorts all values, and then indexes adjacent leaf
      blocks of size 512-1024 (by default) values per block, and their
      docIDs, into a fully balanced binary tree. Building the range filter
      is then just a recursive walk through this tree.

      It's the same structure we use for 2D lat/lon BKD tree, just with 1D
      instead. I implemented it as a DocValuesFormat that also writes the
      numeric tree on the side.

      Attachments

        1. LUCENE-6697.patch
          116 kB
          Michael McCandless
        2. LUCENE-6697.patch
          119 kB
          Michael McCandless
        3. LUCENE-6697.patch
          95 kB
          Michael McCandless

        Activity

          People

            mikemccand Michael McCandless
            mikemccand Michael McCandless
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: