Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Hadoop Flags: Reviewed
Description
FIFO Compaction
Introduction
The FIFO compaction policy selects for collection only those files in which all cells have expired. The column family MUST have a non-default TTL.
Essentially, the FIFO compactor does only one job: it collects expired store files. Here are some applications which could benefit the most:
- Use it for very high-volume raw data with a low TTL, where the raw data is the source of other data (after additional processing). Example: raw time-series vs. time-based rollup aggregates and compacted time-series. We collect raw time-series and store them in a CF with the FIFO compaction policy; periodically we run a task which creates rollup aggregates and compacts the time-series, and the original raw data can be discarded after that.
- Use it for data which can be kept entirely in a block cache (RAM/SSD). Say we have a local SSD (1 TB) which we can use as a block cache. No compaction of the raw data is needed at all.
Because we do not do any real compaction, we do not use CPU or I/O (disk and network), and we do not evict hot data from the block cache. The result: improved throughput and latency for both writes and reads.
See: https://github.com/facebook/rocksdb/wiki/FIFO-compaction-style
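The selection rule above can be sketched in plain Java. This is a minimal illustration of the idea, not the actual HBase implementation; the FileInfo class and its fields are hypothetical stand-ins. A file is collectible exactly when its newest cell timestamp plus the column family's TTL is already in the past, which guarantees every cell in the file has expired:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the FIFO selection idea: pick exactly those store files in which
// every cell has already expired. The newest cell timestamp in a file plus
// the column family TTL tells us when the whole file expires.
// (FileInfo is a hypothetical stand-in, not an HBase class.)
public class FifoSelectionSketch {
    static class FileInfo {
        final String name;
        final long maxTimestampMs; // timestamp of the newest cell in the file
        FileInfo(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    // Returns the files whose cells have all expired under the given TTL.
    static List<FileInfo> selectExpired(List<FileInfo> files, long ttlMs, long nowMs) {
        List<FileInfo> expired = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestampMs + ttlMs < nowMs) {
                expired.add(f); // even the newest cell in f is past its TTL
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        long ttl = 30_000L;
        List<FileInfo> files = new ArrayList<>();
        files.add(new FileInfo("old.sf", 50_000L));   // 50_000 + 30_000 < 100_000: expired
        files.add(new FileInfo("fresh.sf", 90_000L)); // still live, must be kept
        List<FileInfo> toDrop = selectExpired(files, ttl, now);
        System.out.println(toDrop.size() + " " + toDrop.get(0).name);
        // prints: 1 old.sf
    }
}
```

Note that a file containing even one unexpired cell is kept whole; FIFO never rewrites file contents, which is why it costs no CPU or I/O.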
To enable the FIFO compaction policy:
For table:
HTableDescriptor desc = new HTableDescriptor(tableName);
desc.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
    FIFOCompactionPolicy.class.getName());
For CF:
HColumnDescriptor desc = new HColumnDescriptor(family);
desc.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
    FIFOCompactionPolicy.class.getName());
From HBase shell:
create 'x',{NAME=>'y', TTL=>'30'}, {CONFIGURATION => {'hbase.hstore.defaultengine.compactionpolicy.class' => 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy', 'hbase.hstore.blockingStoreFiles' => 1000}}
Although region splitting is supported, for optimal performance it should be disabled, either by setting DisabledRegionSplitPolicy explicitly or by setting ConstantSizeRegionSplitPolicy with a very large maximum region size. You will also have to increase the store's blocking file number, hbase.hstore.blockingStoreFiles, to a very large value (there is a sanity check on the table/column family configuration when FIFO compaction is enabled, and the minimum value for the number of blocking files is 1000).
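Putting these recommendations together with the snippets above, a table-level sketch might look like the following. This is an illustrative configuration fragment using the same HTableDescriptor/HColumnDescriptor API as above; the table name, family name, and TTL value are arbitrary examples:

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.regionserver.DefaultStoreEngine;
import org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy;
import org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy;

HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("x"));
// Disable region splitting for this table, as recommended above.
desc.setRegionSplitPolicyClassName(DisabledRegionSplitPolicy.class.getName());
// Raise the blocking store file count; the FIFO sanity check requires >= 1000.
desc.setConfiguration("hbase.hstore.blockingStoreFiles", "1000");

HColumnDescriptor cf = new HColumnDescriptor("y");
cf.setTimeToLive(30); // a non-default TTL is mandatory for FIFO compaction
cf.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
    FIFOCompactionPolicy.class.getName());
desc.addFamily(cf);
```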
Limitations
Do not use FIFO compaction if:
- The table/CF has MIN_VERSIONS > 0
- The table/CF has TTL = FOREVER (HColumnDescriptor.DEFAULT_TTL)
Issue Links
- is blocked by: HBASE-14467 Compaction improvements: DefaultCompactor should not compact TTL-expired files (Closed)
- is part of: HBASE-14383 Compaction improvements (Closed)
- relates to: HBASE-14847 Add FIFO compaction section to HBase book (Closed)
- requires: HBASE-14511 StoreFile.Writer Meta Plugin (Closed)
Sub-Tasks
1. Implement StoreFile Writer plugin (HBASE-14511) - Closed - Unassigned