HBase / HBASE-28463

Time Based Priority for BucketCache

Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: BucketCache
    • Labels: None

    Description

      This Jira introduces time-based data tiering in HBase to improve storage efficiency and access performance by segregating data according to its recency. Recent data is kept in the bucket cache (backed by faster storage such as SSDs) and older data is evicted. Cache allocation and eviction become more flexibly controllable through configuration, which allows time priorities to be defined for cached data.

      A richer cache allocation mechanism matters most on HBase deployments where cache hits yield significant performance gains, such as when cloud storage is the underlying file system.

      Data is classified as hot or cold based on its age. Recent data that falls within a configured time range (hot-data-age) is treated as hot and is stored in the cache, while older data is stored in and read from the file system.
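
      The hot/cold rule described above can be sketched roughly as follows. This is only an illustration, not the implementation proposed in the design document: the class name, the method names, and the choice of a single per-file timestamp as the age reference are all assumptions made for the example.

```java
// Hypothetical sketch: none of these names come from the HBase code base.
final class HotDataAgeSketch {

    /** Tier assigned to a piece of data based on its age. */
    enum Tier { HOT, COLD }

    /**
     * Classifies data as hot or cold.
     *
     * @param dataTimestampMs timestamp of the data in epoch millis
     *                        (assumed here to be one value per file)
     * @param nowMs           current time in epoch millis
     * @param hotDataAgeMs    the configured hot-data-age window in millis
     */
    static Tier classify(long dataTimestampMs, long nowMs, long hotDataAgeMs) {
        long ageMs = nowMs - dataTimestampMs;
        // Data no older than the hot-data-age window counts as hot.
        return ageMs <= hotDataAgeMs ? Tier.HOT : Tier.COLD;
    }

    /** Only hot data is admitted to the (hypothetical) cache. */
    static boolean shouldCache(long dataTimestampMs, long nowMs, long hotDataAgeMs) {
        return classify(dataTimestampMs, nowMs, hotDataAgeMs) == Tier.HOT;
    }
}
```

      With a hot-data-age of seven days, for example, data written three days ago would be classified as hot and cached, while data written thirty days ago would be classified as cold and read from the file system.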

      This feature is intended to reduce TCO by making better use of the high-cost bucket cache. It is a good fit for use cases where writes are date-based and scans focus on recently written data.

      The detailed design document for the feature is attached to this Jira.

      Thanks,

      Janardhan

            People

              Assignee: Janardhan Hungund
              Reporter: Janardhan Hungund
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated: