- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Component/s: Compaction
- Labels: None
- Hadoop Flags: Reviewed
Release Note:
This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
1. Improves date-range-based scans by organizing store files in a date-based tiered layout.
2. Reduces compaction overhead.
3. Improves TTL efficiency.
It is a good fit for use cases that:
1. have mostly date-based data writes and scans, with a focus on the most recent data.
2. never or rarely delete data.
Out-of-order writes are handled gracefully. Overlapping time ranges among store files are tolerated, and the performance impact is minimized.
Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level through the HBase shell.
Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing
Results from our production environment are at https://docs.google.com/document/d/1GqRtQZMMkTEWOijZc8UCTqhACNmdxBSjtAQSYIWsmGU/edit#
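To make the "date-based tiered layout" in the release note concrete, here is a minimal Python sketch of the general idea: recent data falls into small time windows, and older data falls into progressively larger ones. This is only an illustration under simplifying assumptions, not the HBase implementation. The function names (`tier_windows`, `group_files`) and the parameters (`base`, `per_tier`, `max_age`) are hypothetical, and the linked design spec remains the authoritative description of the windowing rules.

```python
def tier_windows(now, base, per_tier, max_age):
    """Return (start, end] windows going back from `now`, newest first.

    Hypothetical model: each tier holds `per_tier` windows, and the window
    size is multiplied by `per_tier` from one tier to the next. Windows stop
    once they reach back at least `max_age` before `now`.
    """
    windows = []
    end = now
    size = base
    while now - end < max_age:
        for _ in range(per_tier):
            start = end - size
            windows.append((start, end))
            end = start
            if now - end >= max_age:
                break
        size *= per_tier
    return windows


def group_files(files, windows):
    """Group store files by the window that contains their newest timestamp.

    `files` maps a file name to its maximum timestamp. Files older than every
    window are collected under the key None.
    """
    groups = {}
    for name, ts in files.items():
        key = None
        for start, end in windows:
            if start < ts <= end:
                key = (start, end)
                break
        groups.setdefault(key, []).append(name)
    return groups


# Example with made-up timestamps (arbitrary units).
windows = tier_windows(now=100, base=10, per_tier=2, max_age=70)
files = {"a": 95, "b": 92, "c": 85, "d": 70, "e": 65, "f": 10}
for window, names in group_files(files, windows).items():
    print(window, names)
```

In this model, files that land in the same window are the natural candidates to compact together, so a date-range scan over recent data touches only a few small files, and a window whose data has entirely passed its TTL can be dropped as a unit.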
- is duplicated by
  - HBASE-14477 Compaction improvements: Date tiered compaction policy (Resolved)
- relates to
  - HBASE-15337 Document date tiered compaction in the book (Resolved)
  - HBASE-15339 Improve DateTieredCompactionPolicy (Resolved)
- links to