Description
SequenceFile can optionally compress individual values, but both the compression ratio and performance would be much better if sequences of keys and values were compressed together in blocks. Sync marks should then be placed only between blocks. This will require some changes to MapFile too, so that all file positions stored there are the positions of blocks, not of entries within blocks. This can probably be accomplished by adding a getBlockStartPosition() method to SequenceFile.Writer.
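A rough sketch of the proposal, not the actual SequenceFile.Writer code: entries are buffered, each full buffer is compressed as one block, and a sync mark is written only in front of each block. Only the method name getBlockStartPosition() comes from the description above. The class name BlockWriterSketch, the fixed SYNC bytes, the block layout (sync mark, compressed length, payload), the in-memory "file" and the use of java.util.zip are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical illustration only; this is NOT the Hadoop SequenceFile.Writer API.
class BlockWriterSketch {
    // Assumed fixed sync mark (ASCII "SEQ-SYNC"); a real format could pick a different one.
    static final byte[] SYNC = {0x53, 0x45, 0x51, 0x2D, 0x53, 0x59, 0x4E, 0x43};

    private final ByteArrayOutputStream file = new ByteArrayOutputStream();    // stands in for the output file
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream(); // uncompressed entries of the open block
    private final DataOutputStream pendingOut = new DataOutputStream(pending);
    private final int blockSize;

    BlockWriterSketch(int blockSize) {
        this.blockSize = blockSize;
    }

    // Position at which the block holding the most recently appended entry will start.
    // An index such as MapFile's would store this instead of a per-entry position.
    long getBlockStartPosition() {
        return file.size();
    }

    void append(String key, String value) {
        // Flush before adding, so the new entry always lands in the open block.
        if (pending.size() >= blockSize) {
            flushBlock();
        }
        try {
            pendingOut.writeUTF(key);
            pendingOut.writeUTF(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compress all buffered keys and values together and write them as one block.
    private void flushBlock() {
        if (pending.size() == 0) {
            return;
        }
        try {
            ByteArrayOutputStream compressed = new ByteArrayOutputStream();
            try (DeflaterOutputStream z = new DeflaterOutputStream(compressed)) {
                pending.writeTo(z);
            }
            DataOutputStream out = new DataOutputStream(file);
            out.write(SYNC);                  // sync mark sits only between blocks
            out.writeInt(compressed.size());  // length of the compressed payload
            compressed.writeTo(out);
            pending.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Flush the last block and return the bytes of the whole "file".
    byte[] close() {
        flushBlock();
        return file.toByteArray();
    }

    public static void main(String[] args) {
        BlockWriterSketch writer = new BlockWriterSketch(4096);
        long lastStart = -1;
        int blocks = 0;
        for (int i = 0; i < 1000; i++) {
            writer.append("key-" + i, "value-" + i);
            long start = writer.getBlockStartPosition();
            if (start != lastStart) {  // first entry of a new block
                blocks++;
                lastStart = start;
            }
        }
        byte[] bytes = writer.close();
        System.out.println(blocks + " blocks, " + bytes.length + " bytes written");
    }
}
```

In this sketch every entry in a block reports the same start position, so a reader would seek to that position, decompress the block, and scan forward to the wanted key.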
Attachments
Issue Links
- is depended upon by
  - HADOOP-441 SequenceFile should support 'custom compressors' (Closed)