  1. Lucene - Core
  2. LUCENE-328

Some utilities for a compact sparse filter

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None
    • Environment: Operating System: other
      Platform: Other

    • Bugzilla Id: 32921

    Description

      Two files are attached that might form the basis for an alternative
      filter implementation that uses less memory than one bit per document
      when fewer than about 1/8 of the documents pass through the filter.

      The document numbers are stored in RAM as VInts, as defined in the
      Lucene index format. Each VInt encodes the difference between two
      successive document numbers, much like a PositionDelta in the Positions:
      http://jakarta.apache.org/lucene/docs/fileformats.html
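
The delta encoding described above can be sketched roughly as follows. This is only an illustration, not the attached SortedVIntList: the class name VIntDeltaSketch and its encode/decode methods are hypothetical, and the sketch assumes non-negative document numbers in ascending order. The VInt layout used is the one from the Lucene file format: seven payload bits per byte, low-order bits first, with the high bit set on every byte except the last.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of delta + VInt coding; not the attached SortedVIntList. */
final class VIntDeltaSketch {

    /** Encodes ascending, non-negative document numbers as VInt-coded gaps. */
    static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int doc : sortedDocs) {
            int delta = doc - previous; // gap to the previous document number
            if (delta < 0) {
                throw new IllegalArgumentException("document numbers must be ascending");
            }
            // Emit 7 bits at a time, low-order first; the high bit marks "more bytes follow".
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            previous = doc;
        }
        return out.toByteArray();
    }

    /** Decodes the byte array produced by encode() back into document numbers. */
    static int[] decode(byte[] bytes) {
        List<Integer> docs = new ArrayList<>();
        int pos = 0;
        int previous = 0;
        while (pos < bytes.length) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            previous += delta; // running sum restores the absolute document number
            docs.add(previous);
        }
        int[] result = new int[docs.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = docs.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new int[] {3, 10, 300});
        System.out.println(encoded.length + " bytes for 3 document numbers");
    }
}
```

In this sketch the gaps 3 and 7 each fit in one byte, while the gap 290 needs two, so small gaps between passing documents are what keep the list compact.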

      Once a SortedVIntList is constructed, its getByteSize() method can be
      used to verify the compression.
      The precise conditions under which this is more memory efficient than
      one bit per document are not easy to specify in advance.
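
A rough way to see why the break-even point is hard to state in advance is to compare the two sizes for a few synthetic distributions. The sketch below is an assumption-laden illustration, not the attached code: FilterSizeSketch and its methods are hypothetical names, the passing documents are assumed to be evenly spaced (every step-th document, starting at 0), and object overhead is ignored on both sides.

```java
/** Hypothetical size comparison: VInt-coded gaps versus one bit per document. */
final class FilterSizeSketch {

    /** Number of bytes a non-negative value occupies as a VInt (7 payload bits per byte). */
    static int vIntLength(int value) {
        int length = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            length++;
        }
        return length;
    }

    /** Bytes needed to store every step-th document below maxDoc as VInt-coded gaps. */
    static long deltaListBytes(int maxDoc, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        long bytes = 0;
        int previous = 0;
        for (int doc = 0; doc < maxDoc; doc += step) {
            bytes += vIntLength(doc - previous);
            previous = doc;
        }
        return bytes;
    }

    /** Bytes needed for one bit per document, rounded up to whole bytes. */
    static long bitSetBytes(int maxDoc) {
        return (maxDoc + 7L) / 8;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;
        System.out.println("bit set: " + bitSetBytes(maxDoc) + " bytes");
        for (int step : new int[] {4, 10, 200}) {
            System.out.println("every " + step + "th doc: "
                    + deltaListBytes(maxDoc, step) + " bytes");
        }
    }
}
```

Under these assumptions, with 1,000,000 documents the bit set takes 125,000 bytes. Passing every 10th document takes 100,000 bytes as VInt gaps, while passing every 4th takes 250,000. Once gaps reach 128 or more, each passing document costs two bytes rather than one, so the outcome depends on how the passing documents are distributed and not only on how many there are.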

      Attachments

        1. SkipFilter1.patch
          4 kB
          Paul Elschot
        2. IntArraySortedIntList.java
          3 kB
          Mark Harwood
        3. ASF.LICENSE.NOT.GRANTED--TestSortedVIntList.java
          4 kB
          Paul Elschot
        4. ASF.LICENSE.NOT.GRANTED--TestSortedVIntList.java
          4 kB
          Paul Elschot
        5. ASF.LICENSE.NOT.GRANTED--TestSortedVIntList.java
          4 kB
          Paul Elschot
        6. ASF.LICENSE.NOT.GRANTED--TestDocNrSkippers.java
          6 kB
          Mark Harwood
        7. ASF.LICENSE.NOT.GRANTED--TestDocNrSkippers.java
          6 kB
          Mark Harwood
        8. ASF.LICENSE.NOT.GRANTED--SortedVIntList.java
          4 kB
          Paul Elschot
        9. ASF.LICENSE.NOT.GRANTED--SortedVIntList.java
          4 kB
          Paul Elschot
        10. ASF.LICENSE.NOT.GRANTED--SortedVIntList.java
          4 kB
          Paul Elschot
        11. ASF.LICENSE.NOT.GRANTED--OrDocNrSkipper.java
          2 kB
          Mark Harwood
        12. ASF.LICENSE.NOT.GRANTED--OrDocNrSkipper.java
          2 kB
          Mark Harwood
        13. ASF.LICENSE.NOT.GRANTED--IntArraySortedIntList.java
          3 kB
          Mark Harwood
        14. ASF.LICENSE.NOT.GRANTED--DocNrSkipper.java
          1 kB
          Paul Elschot
        15. ASF.LICENSE.NOT.GRANTED--DocNrSkipper.java
          1 kB
          Paul Elschot
        16. ASF.LICENSE.NOT.GRANTED--BitSetSortedIntList.java
          1 kB
          Mark Harwood
        17. ASF.LICENSE.NOT.GRANTED--AndDocNrSkipper.java
          2 kB
          Mark Harwood
        18. ASF.LICENSE.NOT.GRANTED--AndDocNrSkipper.java
          2 kB
          Mark Harwood

            People

              Assignee: Unassigned
              Reporter: Paul Elschot (paul.elschot@xs4all.nl)
              Votes: 5
              Watchers: 0