Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 5.0
Component/s: None
Labels: None
Lucene Fields: New
Robert pointed me to this paper: http://arxiv.org/pdf/1402.6407v4 that describes an interesting way to build doc ID sets. The bit space is divided into blocks of 2^16 bits, so the bits that are set in each block can be stored either in a short[] (2 bytes per doc ID) or in a FixedBitSet. The choice is easy: a block of 2^16 bits always costs 8192 bytes as a FixedBitSet, which is the same as 2^12 shorts, so if fewer than 2^12 bits are set the short[] representation is more compact, and otherwise a FixedBitSet is more compact. This is quite similar to the way that Solr builds DocSets in SolrIndexSearcher.getDocSet(DocsEnumState).
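To make the size trade-off concrete, here is a minimal Java sketch of the per-block choice described above. It is only an illustration, not the code from the paper or from any Lucene patch: the class name BlockedDocIdSetSketch and all of its methods are hypothetical, and a plain long[] stands in for FixedBitSet so the example has no Lucene dependency.

```java
/** Hypothetical sketch of the per-block encoding choice; not Lucene code. */
final class BlockedDocIdSetSketch {
    static final int BLOCK_BITS = 1 << 16;   // doc IDs covered by one block
    static final int SPARSE_LIMIT = 1 << 12; // 4096 shorts * 2 bytes = 8192 bytes

    /** Bytes used by the short[] form for the given number of set bits. */
    static int sparseBytes(int cardinality) {
        return 2 * cardinality;
    }

    /** Bytes used by the bit set form; independent of the number of set bits. */
    static int denseBytes() {
        return BLOCK_BITS / 8;
    }

    /** True when the short[] form is strictly smaller than the bit set form. */
    static boolean useSparse(int cardinality) {
        return cardinality < SPARSE_LIMIT;
    }

    /** Stores the low 16 bits of each doc ID; the input must be sorted and within one block. */
    static short[] encodeSparse(int[] sortedDocIds) {
        short[] low = new short[sortedDocIds.length];
        for (int i = 0; i < sortedDocIds.length; i++) {
            low[i] = (short) sortedDocIds[i]; // the cast keeps the low 16 bits
        }
        return low;
    }

    /** Sets one bit per doc ID in a 2^16-bit bitmap (a stand-in for FixedBitSet). */
    static long[] encodeDense(int[] docIds) {
        long[] words = new long[BLOCK_BITS / 64];
        for (int doc : docIds) {
            int bit = doc & 0xFFFF;
            words[bit >>> 6] |= 1L << (bit & 63);
        }
        return words;
    }

    /** Binary search over the short[] form, comparing values as unsigned 16-bit integers. */
    static boolean containsSparse(short[] low, int docId) {
        int target = docId & 0xFFFF;
        int lo = 0;
        int hi = low.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int value = low[mid] & 0xFFFF;
            if (value < target) {
                lo = mid + 1;
            } else if (value > target) {
                hi = mid - 1;
            } else {
                return true;
            }
        }
        return false;
    }

    /** Tests a single bit in the bitmap form. */
    static boolean containsDense(long[] words, int docId) {
        int bit = docId & 0xFFFF;
        return (words[bit >>> 6] & (1L << (bit & 63))) != 0;
    }
}
```

In this sketch the block index (the high 16 bits of a doc ID) would be tracked by the caller, and each block would be encoded with encodeSparse or encodeDense depending on useSparse(cardinality). At exactly 2^12 set bits both forms cost 8192 bytes, so the sketch falls back to the bitmap there.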