Details
- Type: New Feature
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
Description
This Jira introduces time-based data tiering in HBase, which optimizes storage efficiency and access performance by segregating data according to its recency. Recent data is kept in the bucket cache (backed by faster storage such as SSDs) and older data is evicted. Cache allocation and eviction become configurable, so that time-based priorities can be defined for cached data.
A more flexible cache allocation mechanism is even more critical on HBase deployments where cache hits yield significant performance gains, such as those using cloud storage as the underlying file system.
Data is classified as hot or cold based on its age. Data within a configured time range (hot-data-age) is treated as hot and kept in the cache, while older data is stored in and read from the file system.
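The hot/cold rule described above can be illustrated with a minimal sketch. It is not the HBase implementation: the class, the `CachedBlock` type, and the method names are hypothetical, the 7-day hot-data-age is an assumed value, and the real logic is described in the attached design document.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: none of these names are HBase APIs.
class TimeBasedTieringSketch {

  // Hypothetical stand-in for a cached block, tagged with the newest
  // cell timestamp of the file it belongs to.
  static final class CachedBlock {
    final String name;
    final long maxTimestampMillis;

    CachedBlock(String name, long maxTimestampMillis) {
      this.name = name;
      this.maxTimestampMillis = maxTimestampMillis;
    }
  }

  // Data is hot when its newest timestamp lies within hotDataAgeMillis of "now".
  // Timestamps in the future yield a negative age and are also treated as hot.
  static boolean isHot(long maxTimestampMillis, long nowMillis, long hotDataAgeMillis) {
    return nowMillis - maxTimestampMillis <= hotDataAgeMillis;
  }

  // Returns a copy of the blocks ordered oldest-first, so the coldest data
  // is considered for eviction before more recent data.
  static List<CachedBlock> evictionOrder(List<CachedBlock> blocks) {
    List<CachedBlock> sorted = new ArrayList<>(blocks);
    sorted.sort(Comparator.comparingLong((CachedBlock b) -> b.maxTimestampMillis));
    return sorted;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long hotDataAge = Duration.ofDays(7).toMillis(); // assumed value, for illustration

    List<CachedBlock> blocks = new ArrayList<>();
    blocks.add(new CachedBlock("written-yesterday", now - Duration.ofDays(1).toMillis()));
    blocks.add(new CachedBlock("written-last-month", now - Duration.ofDays(30).toMillis()));

    for (CachedBlock block : evictionOrder(blocks)) {
      String tier = isHot(block.maxTimestampMillis, now, hotDataAge) ? "hot" : "cold";
      System.out.println(block.name + " -> " + tier);
    }
  }
}
```

In this sketch the age is derived from the newest timestamp in a file rather than from individual cells, which is one plausible granularity; the ticket text itself does not state which granularity the feature uses.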
This feature is intended to deliver TCO gains by optimizing the utilization of the high-cost bucket cache. It is a good fit for use cases where writes are date-based and scans focus on recently written data.
The detailed design document for the feature is attached to this Jira.
Thanks,
Janardhan
Issue Links
- is a parent of
  - HBASE-28542 Refactoring Data Tiering Management for Improved Extensibility and Maintainability (Open)
  - HBASE-28465 Implementation of framework for time-based priority bucket-cache. (Resolved)
  - HBASE-28466 Integration of time-based priority logic of bucket cache in prefetch functionality of HBase. (Resolved)
  - HBASE-28467 Integration of time-based priority caching into cacheOnRead read code paths. (Resolved)
  - HBASE-28468 Integration of time-based priority caching logic into cache evictions. (Resolved)
  - HBASE-28469 Integration of time-based priority caching into compaction paths. (Resolved)
  - HBASE-28527 Adjust BlockCacheKey to use the file path instead of file name. (Resolved)
  - HBASE-28535 Implement a region server level configuration to enable/disable data-tiering (Resolved)
- links to