  1. Lucene - Core
  2. LUCENE-5721

Monotonic packed could maybe be faster

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This compression is used in Lucene for monotonically increasing offsets, e.g. the stored fields index, doc values (dv) BINARY/SORTED_SET offsets, OrdinalMap (used for merging and faceting doc values), and so on.

      Today this stores a +/- deviation from an expected line y = mx + b, where b is the minValue for the block and m is the average delta between consecutive values. Because the deviation can be negative, we have to do some additional work to zigzag-decode.
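For illustration, the scheme described above can be sketched roughly as follows. This is not Lucene's actual code: the class and method names are hypothetical, and taking b as the first value of the block and m as (last - first) / (count - 1) is an assumption made for the sketch.

```java
// Hypothetical sketch, not Lucene's implementation: store the zigzag-encoded
// deviation of each value from the expected line y = m*i + b.
public class ZigZagDeviationSketch {

    // Zigzag maps signed longs to unsigned ones: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ...
    public static long zigZagEncode(long v) {
        return (v >> 63) ^ (v << 1);
    }

    public static long zigZagDecode(long v) {
        return (v >>> 1) ^ -(v & 1);
    }

    // Point on the expected line for index i (b = min, m = avg).
    public static long expected(long min, float avg, int i) {
        return min + (long) (avg * i);
    }

    // Assumption for this sketch: m is the average slope over the block.
    public static float averageSlope(long[] values) {
        int n = values.length;
        return n <= 1 ? 0f : (float) (values[n - 1] - values[0]) / (n - 1);
    }

    // Encode a non-empty block: each slot holds a zigzag-encoded signed deviation.
    public static long[] encode(long[] values) {
        long min = values[0];
        float avg = averageSlope(values);
        long[] encoded = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            encoded[i] = zigZagEncode(values[i] - expected(min, avg, i));
        }
        return encoded;
    }

    // Decoding needs the extra zigzag step on every read.
    public static long get(long[] encoded, long min, float avg, int i) {
        return expected(min, avg, i) + zigZagDecode(encoded[i]);
    }
}
```

For the block {100, 103, 110, 112, 120} the expected line is 100 + 5i, so the signed deviations are 0, -2, 0, -3, 0, which zigzag turns into 0, 3, 0, 5, 0 before packing.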

      Can we instead just waste a bit for every value explicitly (lower the minValue by the min delta) so that deltas are never negative and we can have a simpler decode? Maybe if we do this, the new guy should assert that values are actually monotonic at write time. The current one supports "mostly monotonic" input, but do we really need that flexibility anywhere? If so, it could always be kept...
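A minimal sketch of the proposed alternative, under the same assumptions as the description (one block, m taken as the average slope). The names `NonNegativeDeltaSketch` and `Block` are hypothetical and only illustrate the idea; this is not the code that was committed.

```java
// Hypothetical sketch of the proposal: shift the block's min down by the
// smallest deviation so every stored delta is >= 0 and decode needs no zigzag.
public class NonNegativeDeltaSketch {

    public static final class Block {
        public final long min;      // b, lowered by the minimum deviation
        public final float avg;     // m, the average slope of the block
        public final long[] deltas; // non-negative offsets above the line

        Block(long min, float avg, long[] deltas) {
            this.min = min;
            this.avg = avg;
            this.deltas = deltas;
        }
    }

    public static Block encode(long[] values) {
        int n = values.length;
        // Reject non-monotonic input at write time, as the description suggests.
        for (int i = 1; i < n; i++) {
            if (values[i] < values[i - 1]) {
                throw new IllegalArgumentException("values must be monotonically increasing");
            }
        }
        float avg = n <= 1 ? 0f : (float) (values[n - 1] - values[0]) / (n - 1);
        // Smallest value of (value - slope part) over the block becomes the new min.
        long min = Long.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            min = Math.min(min, values[i] - (long) (avg * i));
        }
        long[] deltas = new long[n];
        for (int i = 0; i < n; i++) {
            deltas[i] = values[i] - min - (long) (avg * i);
        }
        return new Block(min, avg, deltas);
    }

    // Decode is a plain add: no sign handling needed.
    public static long get(Block block, int i) {
        return block.min + (long) (block.avg * i) + block.deltas[i];
    }
}
```

For the block {100, 103, 110, 112, 120} the min drops from 100 to 97 and the stored deltas become 3, 1, 3, 0, 3. The trade-off is the one named in the description: the range of stored deltas can grow, costing up to roughly one extra bit per value, in exchange for a decode that is a single addition.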

            People

            • Assignee:
              jpountz Adrien Grand
              Reporter:
              rcmuir Robert Muir