Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Lucene Fields: New
Description
For Sorted/SortedSet types, we encode ordinals and a term dictionary (similar to the old Lucene 3 term dictionary).
Originally we had no prefix compression, so in the fixed-width case we "save space" by avoiding addressing entirely: the offset of a term is just its ordinal multiplied by the fixed length. See https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/codecs/lucene54/Lucene54DocValuesConsumer.java#L423-L425
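To make the multiplication trick concrete, here is a minimal sketch of ordinal lookup over fixed-width terms. The class `FixedWidthTerms` and its methods are hypothetical names for illustration only; this is not the code in Lucene54DocValuesConsumer, just the idea it relies on.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not Lucene's implementation.
public class FixedWidthTerms {
    private final byte[] data; // all terms concatenated in sorted order
    private final int width;   // every term is exactly this many bytes

    public FixedWidthTerms(byte[] data, int width) {
        this.data = data;
        this.width = width;
    }

    /** Start offset of a term: pure arithmetic, so no address table is stored. */
    public long offset(long ord) {
        return ord * width;
    }

    /** Copies out the bytes of the term with the given ordinal. */
    public byte[] lookup(int ord) {
        int start = (int) offset(ord);
        return Arrays.copyOfRange(data, start, start + width);
    }

    public static void main(String[] args) {
        // Three 4-byte terms: "aaaa", "aaab", "aaac"
        byte[] data = "aaaaaaabaaac".getBytes(StandardCharsets.US_ASCII);
        FixedWidthTerms terms = new FixedWidthTerms(data, 4);
        System.out.println(new String(terms.lookup(2), StandardCharsets.US_ASCII)); // prints "aaac"
    }
}
```

Note that the term bytes are stored verbatim: the only thing saved is the per-term address.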
But that means no compression whatsoever of the actual bytes, even if the values are enormous, and I don't think it's necessarily a good tradeoff. The cost of skipping prefix compression is magnified now that we have fixed-width 128-bit point types in the sandbox...
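As a rough illustration of what is being left on the table, the sketch below compares the raw size of sorted 16-byte values with a simple prefix-compressed size. Everything here is an assumption made for the example: the class `PrefixCompressionSketch`, the encoding (one byte for the shared-prefix length, one byte for the suffix length, then the suffix), and the sample data (1000 consecutive integers padded to 128 bits). It is not the format Lucene uses, and a real dictionary would also need periodic uncompressed terms or similar to keep lookup by ordinal cheap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of prefix compression savings, not Lucene's format.
public class PrefixCompressionSketch {

    /** Number of leading bytes that a and b have in common. */
    public static int sharedPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Total bytes when every term is stored verbatim. */
    public static long rawSize(List<byte[]> terms) {
        long size = 0;
        for (byte[] term : terms) {
            size += term.length;
        }
        return size;
    }

    /** Total bytes when each term stores two length bytes plus only its unshared suffix. */
    public static long prefixCompressedSize(List<byte[]> sortedTerms) {
        long size = 0;
        byte[] prev = new byte[0];
        for (byte[] term : sortedTerms) {
            int shared = sharedPrefix(prev, term);
            size += 2 + (term.length - shared); // assumes lengths fit in one byte
            prev = term;
        }
        return size;
    }

    /** Big-endian 16-byte encoding of a non-negative long (upper 8 bytes are zero). */
    public static byte[] to16Bytes(long value) {
        byte[] out = new byte[16];
        for (int i = 0; i < 8; i++) {
            out[15 - i] = (byte) (value >>> (8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> terms = new ArrayList<>();
        for (long v = 0; v < 1000; v++) {
            terms.add(to16Bytes(v)); // already in sorted order
        }
        System.out.println("raw bytes:               " + rawSize(terms));
        System.out.println("prefix-compressed bytes: " + prefixCompressedSize(terms));
    }
}
```

Dense, sorted sample data like this is close to a best case for prefix sharing; values with little common prefix would save far less.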