
Key: LUCENE-328
Type: Improvement
Status: Resolved
Resolution: Duplicate
Priority: Minor
Assignee: Unassigned
Reporter: Paul Elschot
Votes: 5
Watchers: 0
Lucene - Java

Some utilities for a compact sparse filter

Created: 04/Jan/05 12:38 AM   Updated: 28/Jun/06 01:38 AM
Component/s: Search
Affects Version/s: CVS Nightly - Specify date in submission
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
  AndDocNrSkipper.java        2005-06-21 07:52 PM  Mark Harwood  2 kB
  AndDocNrSkipper.java        2005-06-21 06:21 PM  Mark Harwood  2 kB
  BitSetSortedIntList.java    2005-06-21 06:20 PM  Mark Harwood  1 kB
  DocNrSkipper.java           2005-02-09 04:43 AM  Paul Elschot  1 kB
  DocNrSkipper.java           2005-01-17 02:00 AM  Paul Elschot  1 kB
  IntArraySortedIntList.java  2005-11-22 07:07 AM  Mark Harwood  3 kB  (Licensed for inclusion in ASF works)
  IntArraySortedIntList.java  2005-06-21 06:19 PM  Mark Harwood  3 kB
  OrDocNrSkipper.java         2005-06-21 07:53 PM  Mark Harwood  2 kB
  OrDocNrSkipper.java         2005-06-21 06:20 PM  Mark Harwood  2 kB
  SkipFilter1.patch           2006-05-15 11:17 PM  Paul Elschot  4 kB  (Licensed for inclusion in ASF works)
  SortedVIntList.java         2005-02-09 04:44 AM  Paul Elschot  4 kB
  SortedVIntList.java         2005-01-17 02:04 AM  Paul Elschot  4 kB
  SortedVIntList.java         2005-01-04 12:40 AM  Paul Elschot  4 kB
  TestDocNrSkippers.java      2005-06-21 07:53 PM  Mark Harwood  6 kB
  TestDocNrSkippers.java      2005-06-21 06:21 PM  Mark Harwood  6 kB
  TestSortedVIntList.java     2005-02-09 04:46 AM  Paul Elschot  4 kB
  TestSortedVIntList.java     2005-01-17 02:06 AM  Paul Elschot  4 kB
  TestSortedVIntList.java     2005-01-06 03:21 AM  Paul Elschot  4 kB
Environment:
Operating System: other
Platform: Other
Issue Links:
Reference
 

Bugzilla Id: 32921
Resolution Date: 28/Jun/06 01:38 AM


Description
Two files are attached that might form the basis for an alternative
filter implementation that is more memory efficient than one bit
per document when fewer than about 1/8 of the documents pass through the filter.

The document numbers are stored in RAM as VInts, as defined in the Lucene index
format. Each VInt encodes the difference between two successive
document numbers, much like a PositionDelta in the Positions:
http://jakarta.apache.org/lucene/docs/fileformats.html
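
To illustrate the encoding described above, here is a minimal sketch of
storing sorted document numbers as VInt deltas in a byte array. It is not
the attached SortedVIntList; the class name DeltaVIntSketch and its methods
are hypothetical, and it assumes the first document number is stored as a
delta from 0. The byte layout follows the VInt definition in the file
formats document: seven data bits per byte, low-order bits first, with the
high bit set when more bytes follow.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; not the attached SortedVIntList.
class DeltaVIntSketch {

    /** Encodes strictly increasing, non-negative doc numbers as VInt deltas. */
    static byte[] encode(int[] sortedDocNrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0; // assumption: the first delta is taken relative to 0
        for (int doc : sortedDocNrs) {
            int delta = doc - prev;
            // Emit 7 bits at a time, low-order first; high bit means "more follows".
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = doc;
        }
        return out.toByteArray();
    }

    /** Decodes count doc numbers from the VInt deltas in bytes. */
    static int[] decode(byte[] bytes, int count) {
        int[] docs = new int[count];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            prev += delta; // undo the delta to recover the doc number
            docs[i] = prev;
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] docs = {5, 130, 131, 20000};
        byte[] encoded = encode(docs);
        // Deltas are 5, 125, 1 and 19869, taking 1 + 1 + 1 + 3 = 6 bytes.
        System.out.println(encoded.length + " bytes");
        System.out.println(Arrays.toString(decode(encoded, docs.length)));
    }
}
```

Small deltas fit in a single byte, so closely spaced document numbers cost
one byte each, while widely spaced ones cost two or more bytes each.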

The getByteSize() method can be used to verify the compression
once a SortedVIntList is constructed.
The precise conditions under which this is more memory efficient than
one bit per document are not easy to specify in advance.
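
A rough way to see why the break-even point is hard to state in advance is
to compute both sizes for a given set of document numbers. The sketch below
is an assumption-laden illustration, not the attached code: FilterSizeSketch
and its methods are hypothetical names, the VInt size is computed directly
instead of calling getByteSize(), and the one-bit-per-document cost is
taken as maxDoc / 8 bytes rounded up, ignoring object overhead.

```java
// Hypothetical illustration only; sizes ignore JVM object overhead.
class FilterSizeSketch {

    /** Number of bytes a non-negative value occupies as a VInt. */
    static int vIntLength(int value) {
        int n = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            n++;
        }
        return n;
    }

    /** Total bytes for sorted doc numbers stored as VInt deltas (first delta from 0). */
    static long deltaVIntBytes(int[] sortedDocNrs) {
        long total = 0;
        int prev = 0;
        for (int doc : sortedDocNrs) {
            total += vIntLength(doc - prev);
            prev = doc;
        }
        return total;
    }

    /** Bytes needed for one bit per document. */
    static long bitSetBytes(int maxDoc) {
        return (maxDoc + 7L) / 8;
    }

    /** Doc numbers 0, step, 2*step, ... below maxDoc. */
    static int[] everyNth(int maxDoc, int step) {
        int[] docs = new int[(maxDoc + step - 1) / step];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i * step;
        }
        return docs;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;
        System.out.println("bit set:        " + bitSetBytes(maxDoc) + " bytes");
        System.out.println("every 1000th:   " + deltaVIntBytes(everyNth(maxDoc, 1000)) + " bytes");
        System.out.println("every 4th:      " + deltaVIntBytes(everyNth(maxDoc, 4)) + " bytes");
    }
}
```

With evenly spaced documents the VInt list wins easily at 1 in 1000 and
loses at 1 in 4. Real filters are rarely evenly spaced, and the size depends
on how many deltas exceed 127, which is why measuring after construction is
more reliable than predicting.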


