HBASE-6093

Flatten timestamps during flush and compaction

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: io, Performance, regionserver
    • Labels:
      None

      Description

      Many applications run with maxVersions=1 and do not care about timestamps, or they store a single timestamp per row as an ordinary KeyValue instead of relying on per-cell timestamps.

      DataBlockEncoders like those in HBASE-4218 and HBASE-4676 often encode each timestamp as a diff from the previous timestamp or from the minimum timestamp in the block. If all timestamps in a block are the same, they compress to roughly 8 bytes or fewer per block in total. This can be a 10% to 25% space saving for some schemas, and the saving is realized both on disk and in the block cache.
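
      To make the size argument concrete, here is a minimal sketch of a diff-from-minimum scheme. It is a toy model, not the actual encoders from HBASE-4218 or HBASE-4676: the function names and the layout (one 8-byte base timestamp plus a fixed-width diff per cell) are assumptions chosen for illustration.

```python
from typing import Sequence


def fixed_width(max_value: int) -> int:
    """Bytes needed to store any value in [0, max_value]; 0 when max_value is 0."""
    return (max_value.bit_length() + 7) // 8


def raw_timestamp_bytes(timestamps: Sequence[int]) -> int:
    """Unencoded cost: every cell carries its own 8-byte timestamp."""
    return 8 * len(timestamps)


def encoded_timestamp_bytes(timestamps: Sequence[int]) -> int:
    """Toy diff-from-minimum cost (hypothetical layout, not HBase's):
    one 8-byte base timestamp for the block, plus a fixed-width diff
    per cell whose width is set by the largest diff in the block."""
    base = min(timestamps)
    width = fixed_width(max(timestamps) - base)
    return 8 + width * len(timestamps)


mixed = [1000, 1001, 1300]   # three distinct timestamps
flat = [1000, 1000, 1000]    # identical timestamps, as after flattening

print(raw_timestamp_bytes(mixed), encoded_timestamp_bytes(mixed))
print(raw_timestamp_bytes(flat), encoded_timestamp_bytes(flat))
```

      In this model, identical timestamps give a diff width of zero, so the whole block pays only for the 8-byte base regardless of how many cells it holds.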

      We could add a ColumnFamily setting flattenTimestamps=[true/false]. If true, then all timestamps are modified during a flush/compaction to the currentTimeMillis() at the start of the flush/compaction. If all timestamps are made identical in a file, then the encoder will be able to eliminate them.
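
      The proposed behaviour could be sketched as follows. This is only an illustration of the idea, not HBase code: the Cell type, the flatten_timestamps function, and its parameters are hypothetical names, and only flattenTimestamps and the use of the flush start time come from the proposal above.

```python
import time
from typing import List, NamedTuple, Optional, Sequence


class Cell(NamedTuple):
    """Simplified stand-in for a KeyValue (hypothetical, for illustration)."""
    row: bytes
    qualifier: bytes
    timestamp: int
    value: bytes


def flatten_timestamps(cells: Sequence[Cell],
                       flatten: bool,
                       flush_start_ms: Optional[int] = None) -> List[Cell]:
    """Return the cells to write out for one flush or compaction.

    When `flatten` is true (standing in for flattenTimestamps=true on the
    column family), every cell gets the same timestamp: the wall-clock
    time captured once at the start of the flush/compaction.
    """
    if not flatten:
        return list(cells)
    if flush_start_ms is None:
        flush_start_ms = int(time.time() * 1000)
    return [c._replace(timestamp=flush_start_ms) for c in cells]


memstore = [
    Cell(b"row1", b"a", 1338000000123, b"x"),
    Cell(b"row1", b"b", 1338000000456, b"y"),
    Cell(b"row2", b"a", 1338000000789, b"z"),
]
flushed = flatten_timestamps(memstore, flatten=True, flush_start_ms=1338000001000)
print({c.timestamp for c in flushed})
```

      The timestamp is captured once per flush or compaction rather than per cell, which is what makes every timestamp in the resulting file identical.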

      The simplest use case is probably one where all inserts are type=Put, there are no overwrites, and there are no deletes. As use cases get more complex, so does the implementation.

      For example, what happens when there is a Put and a Delete of the same cell in the same memstore? Maybe for a flush at t=flushStartTime, the Put gets timestamp=t, and the Delete gets timestamp=t+1. Or maybe HBASE-4241 could take care of this problem.
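
      The t / t+1 idea can be sketched like this. It covers only the case posed above, where the Delete is meant to win over a Put of the same cell in the same memstore; it does not handle a Put that was written after the Delete, and it says nothing about how HBASE-4241 would behave. The Cell type and function name are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Cell(NamedTuple):
    """Simplified stand-in for a KeyValue (hypothetical, for illustration)."""
    row: bytes
    qualifier: bytes
    timestamp: int
    kind: str  # "Put" or "Delete"


def flatten_with_deletes(cells: Sequence[Cell], flush_start_ms: int) -> List[Cell]:
    """Flatten timestamps so Puts land at t and Deletes at t+1,
    where t is the flush start time. This keeps a Delete strictly
    newer than a Put of the same cell after flattening."""
    flattened = []
    for c in cells:
        offset = 1 if c.kind == "Delete" else 0
        flattened.append(c._replace(timestamp=flush_start_ms + offset))
    return flattened


t = 1338000001000
memstore = [
    Cell(b"row1", b"a", 1338000000100, "Put"),
    Cell(b"row1", b"a", 1338000000200, "Delete"),
    Cell(b"row2", b"a", 1338000000300, "Put"),
]
for cell in flatten_with_deletes(memstore, t):
    print(cell.row, cell.kind, cell.timestamp - t)
```

      One cost of this approach is that the file now holds two distinct timestamps instead of one, so the encoder can no longer drop the timestamps entirely for blocks that contain a Delete.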

          People

          • Assignee:
            Unassigned
            Reporter:
            Matt Corgan
          • Votes:
            2
            Watchers:
            7
