Details
- New Feature
- Status: Closed
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.
It is a good fit for use cases that:
1. have mostly date-based data writes and scans, with a focus on the most recent data.
2. never or rarely delete data.
Out-of-order writes are handled gracefully. Overlapping time ranges among store files are tolerated, and the performance impact is minimized.
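To make the tiered layout concrete, here is a minimal sketch of how time windows might grow with age and how store files could be bucketed by their newest timestamp. It is an illustration only, not the code from this patch: the function names, the `base_size` and `windows_per_tier` parameters, and the `(name, max_timestamp)` file representation are all assumptions made for the example.

```python
from collections import defaultdict


def tiered_windows(now, base_size, windows_per_tier, oldest):
    """Return (start, end) windows from newest to oldest.

    Hypothetical helper: the newest window has width base_size; each time
    the walk reaches a boundary aligned to the next tier, the window width
    is multiplied by windows_per_tier, so older data falls into wider windows.
    """
    size = base_size
    pos = now // size  # index of the window that contains `now`
    windows = []
    while True:
        start = pos * size
        windows.append((start, start + size))
        if start <= oldest:
            break
        if pos % windows_per_tier > 0:
            # Stay in the current tier and step back one window.
            pos -= 1
        else:
            # Aligned to the next tier: switch to the wider window size.
            size *= windows_per_tier
            pos = pos // windows_per_tier - 1
    return windows


def group_by_window(files, windows):
    """Assign each (name, max_timestamp) file to the newest window whose
    start is not after the file's max timestamp."""
    groups = defaultdict(list)
    for name, max_ts in files:
        for start, end in windows:
            if max_ts >= start:
                groups[(start, end)].append(name)
                break
    return dict(groups)
```

With `now=100`, `base_size=10`, `windows_per_tier=2`, and `oldest=0`, the windows are contiguous and widen with age. A file written out of order is simply placed in the older window that matches its newest timestamp, so files in one window may still overlap in time range with files in a neighbouring window.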
Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level through the HBase shell.
The design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing
Results from our production environment are at https://docs.google.com/document/d/1GqRtQZMMkTEWOijZc8UCTqhACNmdxBSjtAQSYIWsmGU/edit#
Attachments
Issue Links
- is duplicated by
- HBASE-14477 Compaction improvements: Date tiered compaction policy
- Closed
- relates to
- HBASE-15337 Document date tiered compaction in the book
- Closed
- HBASE-15339 Improve DateTieredCompactionPolicy
- Closed
- links to