Description
When running RegionSplitter on a 100-node cluster with 900 regions (and plenty of data), the utility took around 72 hours to run. Analysis revealed two major bottlenecks:
1. The logical split step is serialized: the utility waits for each split request to be registered before moving on. Parallelizing this step will bring the actual number of outstanding splits in line with the configured number.
2. Outstanding splits are modeled as a queue. Changing this to a list that is scanned for completed entries will allow handling splits that finish out of order.
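To illustrate the second bottleneck, here is a minimal sketch (not the RegionSplitter code) contrasting the two ways of tracking outstanding splits. The class `OutstandingSplits`, the methods `reapQueue` and `reapScan`, and the region names are hypothetical, and the `isDone` predicate stands in for whatever check the utility uses to decide that a split has finished.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class OutstandingSplits {

    // Queue model: only the head is examined, so a slow split at the
    // front hides every finished split behind it.
    static int reapQueue(LinkedList<String> outstanding, Predicate<String> isDone) {
        int freed = 0;
        while (!outstanding.isEmpty() && isDone.test(outstanding.peekFirst())) {
            outstanding.removeFirst();
            freed++;
        }
        return freed;
    }

    // List-with-scan model: every entry is examined, so splits that
    // finish out of order free their slots immediately.
    static int reapScan(List<String> outstanding, Predicate<String> isDone) {
        int freed = 0;
        Iterator<String> it = outstanding.iterator();
        while (it.hasNext()) {
            if (isDone.test(it.next())) {
                it.remove();
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        // Hypothetical state: r1 was issued first but is still splitting;
        // r2 and r3 were issued later and have already finished.
        Set<String> finished = new HashSet<>(Arrays.asList("r2", "r3"));

        LinkedList<String> queue = new LinkedList<>(Arrays.asList("r1", "r2", "r3"));
        List<String> list = new LinkedList<>(Arrays.asList("r1", "r2", "r3"));

        System.out.println("queue freed " + reapQueue(queue, finished::contains));
        System.out.println("scan freed " + reapScan(list, finished::contains));
    }
}
```

In this scenario the queue frees no slots because the unfinished `r1` sits at the head, while the scan frees the two slots held by `r2` and `r3`, leaving room to issue new splits.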
Attachments
Issue Links
- depends upon: HBASE-3653 Parallelize Server Requests on HBase Client (Closed)