Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Hadoop Flags: Reviewed
The tool lets you major compact a cluster while limiting how many RegionServers are compacting at any given time. If the tool completes successfully, everything requested for compaction will have been compacted, regardless of region moves, splits, and merges.
Description
The basic overview of how this tool works is:
Parameters:
Table
Stores
ClusterConcurrency
Timestamp
You supply a table, the desired concurrency, and the list of stores you wish to major compact. The tool first checks the filesystem to see which stores need compaction based on the timestamp you provide (the default is the current time). It then takes the list of stores that require compaction and executes those requests concurrently, with at most N distinct RegionServers compacting at any given time, where N is the requested concurrency. Each thread waits for its compaction to complete before moving on to the next request in the queue. If a region split, merge, or move happens, the tool ensures those regions get major compacted as well.
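The scheduling idea can be illustrated with a small, self-contained sketch. This is not the tool's actual implementation, which is written in Java against the HBase client API. Every name here (`StoreInfo`, `needs_compaction`, `compact_cluster`, `compact_fn`) is hypothetical, and the rule "a store needs compaction if its oldest file predates the cutoff timestamp" is an assumption made for illustration. The sketch also omits the re-checking that the real tool performs after region moves, splits, and merges.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreInfo:
    """Hypothetical description of one store (region + column family)."""
    region: str
    family: str
    server: str          # RegionServer currently hosting the region
    oldest_file_ts: int  # timestamp of the oldest store file


def needs_compaction(store: StoreInfo, cutoff_ts: int) -> bool:
    # Assumed rule: any file older than the cutoff means the store
    # has not been major compacted since that time.
    return store.oldest_file_ts < cutoff_ts


def compact_cluster(stores, cutoff_ts, concurrency, compact_fn):
    """Compact every store that needs it, with at most `concurrency`
    distinct servers compacting at once. Returns the number compacted.

    `compact_fn(store)` must block until the compaction has finished.
    """
    # One queue of pending requests per RegionServer.
    queues = defaultdict(list)
    for store in stores:
        if needs_compaction(store, cutoff_ts):
            queues[store.server].append(store)

    def drain(server):
        # A worker owns one server at a time and waits for each
        # compaction to finish before issuing the next request.
        for store in queues[server]:
            compact_fn(store)
        return len(queues[server])

    # The pool size caps how many servers are being drained concurrently.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return sum(pool.map(drain, list(queues)))
```

Calling `compact_cluster(stores, cutoff_ts, 2, compact_fn)` would drain the per-server queues two servers at a time. In a real deployment `compact_fn` would have to issue the major compaction request and poll until the RegionServer reports it finished; that part is left abstract here.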
This helps us in two ways: we can limit how much I/O bandwidth major compaction uses cluster-wide, and we are guaranteed that once the tool completes, all requested compactions have completed, regardless of moves, merges, and splits.
Attachments
Issue Links
- is related to HBASE-19967 Add Major Compaction Tool options for off-peak / on-peak hours (Resolved)