Affects Version/s: None
Fix Version/s: None
Operating System: other
The two attached files might form the basis for an alternative
filter implementation that is more memory efficient than one bit
per doc when fewer than about 1/8 of the docs pass through the filter.
The document numbers are stored in RAM as VInts from the Lucene index
format. These VInts encode the difference between two successive
document numbers, much like a PositionDelta in the Positions:
The getByteSize() method can be used to verify the compression
once a SortedVIntList is constructed.
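
The delta encoding described above can be sketched roughly as follows. This is not the attached SortedVIntList; the class name DeltaVIntSketch and its methods are hypothetical, and the sketch assumes the first document number is stored as its difference from 0. The length of the returned array plays the role that getByteSize() would play for a real list.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration only, not the attached SortedVIntList. */
final class DeltaVIntSketch {

    /** Encodes sorted, non-negative doc numbers as VInt deltas. */
    static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0; // assumption: the first doc is stored as a delta from 0
        for (int doc : sortedDocs) {
            int delta = doc - previous;
            if (delta < 0) {
                throw new IllegalArgumentException("doc numbers must be sorted");
            }
            // VInt: seven payload bits per byte, low-order bits first;
            // a set high bit means another byte follows.
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            previous = doc;
        }
        return out.toByteArray();
    }

    /** Decodes count doc numbers back from the VInt deltas. */
    static int[] decode(byte[] bytes, int count) {
        int[] docs = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            previous += delta;
            docs[i] = previous;
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] docs = {3, 5, 130, 131, 20000};
        byte[] encoded = encode(docs);
        System.out.println(encoded.length + " bytes for " + docs.length + " docs");
    }
}
```

Small gaps between document numbers take one byte each, while the large gap before 20000 takes three, so the total size depends on how the matching documents are spread over the index.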
The precise conditions under which this is more memory efficient than
one bit per document are not easy to specify in advance.
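
To illustrate why the break-even point is hard to state up front, here is a rough size comparison. Everything in it is an assumption for illustration: the class FilterSizeSketch, the evenly spaced document numbers, and the index size of 1,000,000 documents are hypothetical and do not come from the attached files. A delta costs one byte only while the gap is below 128, which is where the "about 1/8" figure comes from; larger gaps cost two or more bytes each.

```java
/** Hypothetical size comparison, not part of the attached files. */
final class FilterSizeSketch {

    /** Number of bytes a non-negative int occupies as a VInt. */
    static int vIntLength(int value) {
        int length = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            length++;
        }
        return length;
    }

    /** Total bytes for sorted doc numbers stored as VInt deltas (first delta taken from 0). */
    static long deltaVIntBytes(int[] sortedDocs) {
        long total = 0;
        int previous = 0;
        for (int doc : sortedDocs) {
            total += vIntLength(doc - previous);
            previous = doc;
        }
        return total;
    }

    /** Bytes needed for one bit per document. */
    static long bitSetBytes(int maxDoc) {
        return (maxDoc + 7L) / 8;
    }

    /** Doc numbers 0, step, 2*step, ... below maxDoc. */
    static int[] everyNth(int maxDoc, int step) {
        int[] docs = new int[(maxDoc + step - 1) / step];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i * step;
        }
        return docs;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;
        System.out.println("bit set:         " + bitSetBytes(maxDoc) + " bytes");
        System.out.println("every 100th doc: " + deltaVIntBytes(everyNth(maxDoc, 100)) + " bytes");
        System.out.println("every 200th doc: " + deltaVIntBytes(everyNth(maxDoc, 200)) + " bytes");
    }
}
```

In this sketch, halving the number of matching documents (every 200th instead of every 100th) barely changes the encoded size, because each gap of 200 needs two bytes instead of one. The distribution of gaps matters as much as the fraction of documents that pass.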