Details
- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
Explore ways to optimize the compression mechanisms used by RCFile.
Some initial ideas:
1. More efficient serialization/deserialization based on type-specific and storage-specific knowledge.
For instance, store sorted numeric values compactly using delta coding.
2. More efficient compression based on type-specific and storage-specific knowledge.
For instance, allow compression codecs to be specified per type or per individual column.
3. Reordering the on-disk storage for better compression efficiency.
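To illustrate idea 1, here is a minimal sketch of delta coding on a sorted numeric column. It is not RCFile or Hive code: the helper names (`delta_encode`, `delta_decode`, `pack_int64`) and the sample column are made up for illustration, and zlib stands in for whatever codec the file format would actually use.

```python
import struct
import zlib


def delta_encode(values):
    """Replace each value with its difference from the previous value."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the deltas."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def pack_int64(values):
    """Serialize integers as fixed-width little-endian 64-bit values."""
    return struct.pack("<%dq" % len(values), *values)


# Hypothetical sorted column: 10,000 evenly spaced integers.
column = [1_000_000 + 7 * i for i in range(10_000)]
deltas = delta_encode(column)

# Compress the raw column and the delta-coded column with the same codec.
plain_size = len(zlib.compress(pack_int64(column)))
delta_size = len(zlib.compress(pack_int64(deltas)))
print("plain:", plain_size, "bytes; delta-coded:", delta_size, "bytes")
```

On sorted data the deltas are small and repetitive, so a general-purpose codec has far more redundancy to exploit than it finds in the raw values. The gain depends heavily on the data; unsorted or high-entropy columns may see little or no benefit.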
Attachments
Issue Links
- is related to: HIVE-352 Make Hive support column based storage (Closed)
1. Add UberCompressor Serde/Codec to contrib which allows per-column compression strategies | Patch Available | Krishna Kumar
2. Add Integer type compressors | Patch Available | Krishna Kumar
3. Add compressors for qualitative data types | Open | Krishna Kumar