HBASE-14477

Compaction improvements: Date tiered compaction policy

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • 2.0.0
    • None
    • None

    Description

      For immutable and mostly immutable data, the current size-tiered compaction policy is not efficient.

      1. There is no need to compact all files into one: the data is (mostly) immutable, so there is little garbage to collect. (The performance reasons will be discussed later.)
      2. Size-tiered compaction is not suitable for applications where the most recent data is the most important, because it prevents efficient caching of that data.

      The idea is very similar to DateTieredCompactionStrategy (DTCS) in Cassandra:

      http://www.datastax.com/dev/blog/datetieredcompactionstrategy
      http://www.datastax.com/dev/blog/dtcs-notes-from-the-field

      From the Cassandra blog posts linked above:

      Since DTCS can be used with any table, it is important to know when it is a good idea, and when it is not. I’ll try to explain the spectrum and trade-offs here:

      1. Perfect Fit: Time Series Fact Data, Deletes by Default TTL: When you ingest fact data that is ordered in time, with no deletes or overwrites. This is the standard “time series” use case.

      2. OK Fit: Time-Ordered, with limited updates across whole data set, or only updates to recent data: When you ingest data that is (mostly) ordered in time, but revise or delete a very small proportion of the overall data across the whole timeline.

      3. Not a Good Fit: many partial row updates or deletions over time: When you need to partially revise or delete fields for rows that you read together. Also, when you revise or delete rows within clustered reads.
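
      The core idea, grouping files by age and compacting only within a group, can be sketched as follows. This is a minimal illustration under stated assumptions, not the HBase or Cassandra implementation: StoreFile, select_compactions, window_seconds and min_files are hypothetical names, and the windows are fixed-size for simplicity, whereas DTCS as described in the linked posts uses windows that grow with age.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreFile:
    """Hypothetical stand-in for a store file; not an HBase class."""
    name: str
    max_timestamp: int  # newest cell timestamp in the file, in seconds


def select_compactions(files, window_seconds, min_files=2):
    """Group files into fixed time windows and pick windows worth compacting.

    Files are only merged with other files from the same window. A window
    that is down to a single file is no longer selected, so old data is
    not rewritten again unless new files land in that window.
    """
    windows = defaultdict(list)
    for f in files:
        # Integer division maps each timestamp to the start of its window.
        window_start = (f.max_timestamp // window_seconds) * window_seconds
        windows[window_start].append(f)

    # Newest windows first; skip windows with too few files to merge.
    ordered = sorted(windows.items(), key=lambda kv: kv[0], reverse=True)
    return [
        (start, sorted(group, key=lambda f: f.max_timestamp))
        for start, group in ordered
        if len(group) >= min_files
    ]


files = [
    StoreFile("a", 10),
    StoreFile("b", 50),
    StoreFile("c", 3700),
    StoreFile("d", 3900),
    StoreFile("e", 7300),
]
plan = select_compactions(files, window_seconds=3600)
for start, group in plan:
    print(start, [f.name for f in group])
```

      With one-hour windows, files c and d are merged together and files a and b are merged together, while e is left alone because it is the only file in the newest window. No compaction ever mixes recent data with old data, which is what keeps recent files small and cache-friendly.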

            People

              Assignee: Unassigned
              Reporter: Vladimir Rodionov (vrodionov)