XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • None
    • 5.0
    • None
    • None
    • New

    Description

      Robert pointed me to this paper: http://arxiv.org/pdf/1402.6407v4 that describes an interesting way to build doc id sets: The bit space is divided into blocks of 2^16 bits so that you can store the bits which are set either in a short[] (2 bytes per doc ID) or in a FixedBitSet. The choice is easy, if less than 2^12 bits are set, then the short[] representation is more compact otherwise a FixedBitSet would be more compact. It's quite similar to the way that Solr builds DocSets in SolrIndexSearcher.getDocSet(DocsEnumState).

      Attachments

        1. LUCENE-5983.patch
          24 kB
          Adrien Grand

        Activity

          People

            jpountz Adrien Grand
            jpountz Adrien Grand
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: