Lucene - Core / LUCENE-328

Some utilities for a compact sparse filter

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None
    • Environment: Operating System: other
      Platform: Other

    • 32921

    Description

      Two files are attached that might form the basis for an alternative
      filter implementation that uses less memory than one bit per document
      when fewer than about 1/8 of the documents pass through the filter.

      The document numbers are stored in RAM as VInts, as defined in the
      Lucene index format. Each VInt encodes the difference between two
      successive document numbers, much like a PositionDelta in the Positions:
      http://jakarta.apache.org/lucene/docs/fileformats.html
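
      The encoding described above might be sketched as follows. This is an
      illustration only, not the attached SortedVIntList: the class name
      VIntDeltaSketch and its encode/decode methods are hypothetical, and the
      input is assumed to be non-negative document numbers in ascending order.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of delta + VInt encoding; not the attached SortedVIntList. */
public class VIntDeltaSketch {

    /** Encodes ascending, non-negative document numbers as VInt deltas. */
    static byte[] encode(int[] sortedDocNrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docNr : sortedDocNrs) {
            int delta = docNr - previous;
            if (delta < 0) {
                throw new IllegalArgumentException("document numbers must be ascending");
            }
            // VInt: 7 data bits per byte, low-order bits first;
            // a set high bit means that more bytes follow.
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            previous = docNr;
        }
        return out.toByteArray();
    }

    /** Decodes the byte array back into document numbers by summing the deltas. */
    static int[] decode(byte[] bytes) {
        List<Integer> docs = new ArrayList<>();
        int pos = 0;
        int previous = 0;
        while (pos < bytes.length) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            docs.add(previous);
        }
        int[] result = new int[docs.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = docs.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] docs = {3, 130, 131, 20000};
        byte[] packed = encode(docs);
        System.out.println(packed.length + " bytes for " + docs.length + " document numbers");
        System.out.println(Arrays.toString(decode(packed)));
    }
}
```

      Small gaps between document numbers fit in a single byte, while larger
      gaps take two or more bytes, so the total size depends on how the
      matching documents are distributed.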

      Once a SortedVIntList is constructed, its getByteSize() method can be
      used to verify the compression.
      The precise conditions under which this is more memory efficient than
      one bit per document are not easy to specify in advance, because the
      number of bytes per VInt depends on the gaps between successive
      document numbers.
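
      A rough size comparison along these lines could look like the sketch
      below. The class FilterSizeSketch and all of its methods are
      hypothetical and only count payload bytes; they do not call the
      attached getByteSize() and ignore object and array overhead. The
      evenly spaced document numbers are an assumption made for illustration.

```java
/** Hypothetical payload-size estimate; not the attached getByteSize(). */
public class FilterSizeSketch {

    /** Number of bytes one VInt needs for a non-negative value. */
    static int vIntBytes(int value) {
        int n = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            n++;
        }
        return n;
    }

    /** Total VInt bytes for the deltas of ascending document numbers. */
    static long vIntListBytes(int[] sortedDocNrs) {
        long total = 0;
        int previous = 0;
        for (int docNr : sortedDocNrs) {
            total += vIntBytes(docNr - previous);
            previous = docNr;
        }
        return total;
    }

    /** Bytes needed for one bit per document, rounded up. */
    static long bitSetBytes(int maxDoc) {
        return (maxDoc + 7L) / 8;
    }

    /** Document numbers 0, step, 2*step, ... below maxDoc (illustrative input). */
    static int[] everyNth(int maxDoc, int step) {
        int count = (maxDoc + step - 1) / step;
        int[] docs = new int[count];
        for (int i = 0; i < count; i++) {
            docs[i] = i * step;
        }
        return docs;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;
        System.out.println("one bit per doc: " + bitSetBytes(maxDoc) + " bytes");
        for (int step : new int[] {4, 8, 100, 200}) {
            long bytes = vIntListBytes(everyNth(maxDoc, step));
            System.out.println("every " + step + "th doc: " + bytes + " VInt bytes");
        }
    }
}
```

      With evenly spaced documents the break-even point sits near one
      matching document in eight, since each delta then takes one byte.
      Uneven spacing shifts that point, which is why measuring the
      constructed list is more reliable than predicting its size.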

      Attachments

        1. SkipFilter1.patch
          4 kB
          Paul Elschot
        2. IntArraySortedIntList.java
          3 kB
          Mark Harwood
        3. ASF.LICENSE.NOT.GRANTED--TestDocNrSkippers.java
          6 kB
          Mark Harwood
        4. ASF.LICENSE.NOT.GRANTED--OrDocNrSkipper.java
          2 kB
          Mark Harwood
        5. ASF.LICENSE.NOT.GRANTED--AndDocNrSkipper.java
          2 kB
          Mark Harwood
        6. ASF.LICENSE.NOT.GRANTED--TestDocNrSkippers.java
          6 kB
          Mark Harwood
        7. ASF.LICENSE.NOT.GRANTED--AndDocNrSkipper.java
          2 kB
          Mark Harwood
        8. ASF.LICENSE.NOT.GRANTED--OrDocNrSkipper.java
          2 kB
          Mark Harwood
        9. ASF.LICENSE.NOT.GRANTED--BitSetSortedIntList.java
          1 kB
          Mark Harwood
        10. ASF.LICENSE.NOT.GRANTED--IntArraySortedIntList.java
          3 kB
          Mark Harwood
        11. ASF.LICENSE.NOT.GRANTED--TestSortedVIntList.java
          4 kB
          Paul Elschot
        12. ASF.LICENSE.NOT.GRANTED--SortedVIntList.java
          4 kB
          Paul Elschot
        13. ASF.LICENSE.NOT.GRANTED--DocNrSkipper.java
          1 kB
          Paul Elschot
        14. ASF.LICENSE.NOT.GRANTED--TestSortedVIntList.java
          4 kB
          Paul Elschot
        15. ASF.LICENSE.NOT.GRANTED--SortedVIntList.java
          4 kB
          Paul Elschot
        16. ASF.LICENSE.NOT.GRANTED--DocNrSkipper.java
          1 kB
          Paul Elschot
        17. ASF.LICENSE.NOT.GRANTED--TestSortedVIntList.java
          4 kB
          Paul Elschot
        18. ASF.LICENSE.NOT.GRANTED--SortedVIntList.java
          4 kB
          Paul Elschot

            People

              Assignee: Unassigned
              Reporter: Paul Elschot (paul.elschot@xs4all.nl)
              Votes: 5
              Watchers: 0