[HBASE-14383] Compaction improvements - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

Compaction is still a major issue in many production environments. The general recommendation is to disable automatic region splitting and major compactions to reduce unpredictable IO/CPU spikes, especially during peak times, and to run them manually during off-peak times. This still does not resolve the issues completely.

Flush storms

Rolling WAL events across the cluster can be highly correlated. They cause memstore flushes, which in turn trigger minor compactions that can be promoted to major ones. These events are highly correlated in time if there is a balanced write load on the regions in a table.
The same is true for memstore flushes triggered by the periodic memstore flusher.

Both of the above may produce flush storms, which are as bad as compaction storms.

What can be done here? We can spread these events over time by randomizing (with jitter) several config options:

hbase.regionserver.optionalcacheflushinterval
hbase.regionserver.flush.per.changes
hbase.regionserver.maxlogs
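
The jitter idea can be sketched as follows. This is a minimal illustration, not the HBase implementation: the class name FlushJitter, the helper jittered, the 20% jitter ratio, and the base values are all assumptions made for the example. Each region server would draw its own random factor, so thresholds that are identical in the configuration end up slightly different on every server and the flushes drift apart in time.

```java
import java.util.Random;

// Hypothetical helper, not part of HBase: spreads a configured threshold
// around its base value so that servers do not all hit it at the same moment.
final class FlushJitter {

    // Returns base scaled by a random factor in [1 - ratio, 1 + ratio).
    static long jittered(long base, double ratio, Random rng) {
        double factor = 1.0 + ratio * (2.0 * rng.nextDouble() - 1.0);
        return Math.round(base * factor);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double ratio = 0.2; // assumed jitter ratio, chosen for illustration

        // Illustrative base values standing in for the three options above.
        long flushIntervalMs = 3_600_000L;  // hbase.regionserver.optionalcacheflushinterval
        long flushPerChanges = 30_000_000L; // hbase.regionserver.flush.per.changes
        long maxLogs = 32L;                 // hbase.regionserver.maxlogs

        System.out.println("flush interval (ms): " + jittered(flushIntervalMs, ratio, rng));
        System.out.println("flush per changes:   " + jittered(flushPerChanges, ratio, rng));
        System.out.println("max logs:            " + jittered(maxLogs, ratio, rng));
    }
}
```

Drawing the factor once per server (or per region) at startup, rather than on every check, keeps the effective threshold stable while still decorrelating servers from each other.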

ExploringCompactionPolicy max compaction size

One more optimization can be added to ExploringCompactionPolicy. To limit the size of a compaction, one can use the config parameter hbase.hstore.compaction.max.size. It would be nice to have two separate limits: one for peak hours and one for off-peak hours.
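
A sketch of how two limits might be selected by time of day. Everything here is an assumption for illustration: the class CompactionSizeLimit, its method names, and the idea of passing the off-peak window as start and end hours (similar in spirit to the hbase.offpeak.start.hour and hbase.offpeak.end.hour settings). It does not reproduce the actual policy code.

```java
// Hypothetical helper, not part of HBase: picks the compaction size limit
// that applies at a given hour of the day.
final class CompactionSizeLimit {

    // True if hour falls in the off-peak window [startHour, endHour).
    // The window may wrap around midnight, e.g. 22 to 6.
    static boolean isOffPeak(int hour, int startHour, int endHour) {
        if (startHour == endHour) {
            return false; // empty window: treat every hour as peak
        }
        if (startHour < endHour) {
            return hour >= startHour && hour < endHour;
        }
        return hour >= startHour || hour < endHour; // wraps past midnight
    }

    // Returns the off-peak limit inside the window, the peak limit otherwise.
    static long maxCompactionSize(int hour, int startHour, int endHour,
                                  long peakLimit, long offPeakLimit) {
        return isOffPeak(hour, startHour, endHour) ? offPeakLimit : peakLimit;
    }

    public static void main(String[] args) {
        long peakLimit = 1L << 30;     // assumed 1 GiB limit during peak hours
        long offPeakLimit = 10L << 30; // assumed 10 GiB limit during off-peak hours
        for (int hour : new int[] {2, 12, 23}) {
            System.out.println("hour " + hour + " -> limit "
                + maxCompactionSize(hour, 22, 6, peakLimit, offPeakLimit));
        }
    }
}
```

With a smaller limit during peak hours, large compactions are deferred to the off-peak window instead of being disabled outright.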

ExploringCompactionPolicy selection evaluation algorithm

Too simple? A selection with more files always wins; a selection of smaller total size wins only if the number of files is the same.
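
The rule as described can be written down as a short comparison. This is a sketch of the rule as stated in this issue, not a copy of the ExploringCompactionPolicy source; the class SelectionComparison and its methods are hypothetical, and each selection is modeled simply as a list of file sizes in bytes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the evaluation rule described above.
final class SelectionComparison {

    // Sum of file sizes in a candidate selection.
    static long totalSize(List<Long> fileSizes) {
        long sum = 0;
        for (long size : fileSizes) {
            sum += size;
        }
        return sum;
    }

    // True if candidate beats the current best selection:
    // more files always wins; on a tie in file count, smaller total size wins.
    static boolean isBetter(List<Long> candidate, List<Long> best) {
        if (candidate.size() != best.size()) {
            return candidate.size() > best.size();
        }
        return totalSize(candidate) < totalSize(best);
    }

    public static void main(String[] args) {
        List<Long> twoSmall = Arrays.asList(10L, 10L);
        List<Long> threeLarge = Arrays.asList(1_000L, 1_000L, 1_000L);
        // Three large files beat two small ones, regardless of total size.
        System.out.println(isBetter(threeLarge, twoSmall));
    }
}
```

The example in main shows why the rule may be too simple: a selection with one more file wins even when it rewrites far more data.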

Issue Links

incorporates

HBASE-14477 Compaction improvements: Date tiered compaction policy (Closed)
HBASE-14496 Compaction improvements: Delayed compaction in RatioBasedCompactionPolicy (Closed)
HBASE-14651 Default minimum compaction size is too high (Closed)
HBASE-14387 Compaction improvements: Maximum off-peak compaction size (Closed)
HBASE-14388 Compaction improvements: Avoid flush storms by jittering flush interval and max log files (Closed)
HBASE-14389 Compaction improvements: ExploringCompactionPolicy selection evaluation algorithm (Closed)
HBASE-14467 Compaction improvements: DefaultCompactor should not compact TTL-expired files (Closed)
HBASE-14468 Compaction improvements: FIFO compaction policy (Closed)

People

Assignee: Unassigned

Reporter: Vladimir Rodionov

Votes: 0

Watchers: 19

Dates

Created: 09/Sep/15 00:08

Updated: 01/Jul/22 21:18

Resolved: 01/Jul/22 21:18