Details
- Type: Improvement
- Status: Closed
- Priority: Minor
- Resolution: Later
- Lucene Fields: New
Description
This change applies to the new bulk branch.
Currently, our BlockPostingsFormat only supports skipInterval == blockSize. Every time the skipper reaches the last level-0 skip point, we have to decode a whole block to read the doc/freq data. Also, a higher-level skip list is created only for terms with df > blockSize^k, which means that for most terms skipping is just a linear scan. If we increase the current blockSize for better bulk I/O performance, the current skip setting will become a bottleneck.
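To make the df > blockSize^k condition concrete, here is a rough sketch that counts how many skip levels a term would get for a given skip interval. It assumes a simplified model in which the skip multiplier equals the skip interval; the class and method names are hypothetical and are not taken from the branch.

```java
// Hypothetical helper, not code from the bulk branch.
// Assumption: level k exists only when df > skipInterval^k (k >= 1),
// i.e. the skip multiplier is taken to be equal to the skip interval.
final class SkipLevelSketch {

    /** Number of skip levels a term with the given docFreq would get. */
    static int numSkipLevels(int df, int skipInterval) {
        int levels = 0;
        long threshold = skipInterval; // skipInterval^1
        while (df > threshold) {
            levels++;
            threshold *= skipInterval; // next power of skipInterval
        }
        return levels;
    }

    /** Number of level-0 skip points, one per full interval of docs. */
    static int level0SkipPoints(int df, int skipInterval) {
        return df / skipInterval;
    }
}
```

Under this model, a term with df = 2000 gets a single skip level when skipInterval is 128 but two levels when it is 32, and a term with df = 100 gets no skip data at all at 128. That is the effect described above: a larger blockSize leaves more terms with nothing but a linear scan.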
For ForPF, the encoded block can easily be split if we set skipInterval = 32*k.
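The reason a multiple of 32 is convenient can be sketched as follows: with plain bit packing into 32-bit words, 32 values of b bits each occupy exactly b words, so every group of 32 values starts on a word boundary and can be decoded without touching the rest of the block. The code below is a minimal illustration of that property, not the actual ForPF encoder; the class and method names are hypothetical, and it assumes non-negative values that fit in the given bit width.

```java
// Hypothetical illustration, not the ForPF implementation.
// Assumptions: values.length is a multiple of 32, 1 <= bits <= 32,
// and every value is non-negative and fits in `bits` bits.
final class ForSplitSketch {

    /** Bit-packs the values, least significant bits first, into 32-bit words. */
    static int[] pack(int[] values, int bits) {
        int[] packed = new int[values.length / 32 * bits];
        long bitPos = 0;
        for (int v : values) {
            int word = (int) (bitPos >>> 5);
            int shift = (int) (bitPos & 31);
            packed[word] |= v << shift;
            if (shift + bits > 32) {
                // The value straddles two words: store its high bits in the next one.
                packed[word + 1] |= v >>> (32 - shift);
            }
            bitPos += bits;
        }
        return packed;
    }

    /** Decodes only the 32 values of the given group, starting at word group * bits. */
    static int[] unpackGroup(int[] packed, int bits, int group) {
        int[] out = new int[32];
        int base = group * bits; // word-aligned start of this group
        int mask = bits == 32 ? -1 : (1 << bits) - 1;
        for (int i = 0; i < 32; i++) {
            int bitPos = i * bits;
            int word = base + (bitPos >>> 5);
            int shift = bitPos & 31;
            int v = packed[word] >>> shift;
            if (shift + bits > 32) {
                v |= packed[word + 1] << (32 - shift);
            }
            out[i] = v & mask;
        }
        return out;
    }
}
```

With a 128-value block packed at 7 bits per value, group 2 starts at word 14, so a skip point placed every 32*k values can record a word offset into the block and decoding can resume there.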
Attachments
Issue Links
- is related to
  LUCENE-3892 Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) (Closed)
- relates to
  LUCENE-4225 New FixedPostingsFormat for less overhead than SepPostingsFormat (Resolved)