Details
Description
The balancer needs to query for blocks to move from overly full DataNodes (DNs). The block lookup is extremely inefficient: an iterator over the node's blocks is built from the iterators of its storages' blocks, and a random number is chosen for how many blocks to skip through that iterator. Each skip requires a costly scan of triplets.
The current design also considers only imbalances between nodes and ignores imbalances among a node's storages. A more efficient and intelligent design could eliminate the costly block skipping by selecting blocks round-robin from the storages, based on remaining capacity.
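As a rough illustration of the proposed direction (not the actual patch), the sketch below picks blocks round-robin across a node's storages, visiting the storage with the least remaining capacity first, so no blocks are ever skipped. Every name here (`StorageRoundRobin`, `Storage`, `selectBlocks`, `remainingBytes`) is hypothetical and is not an HDFS class or method. Ordering by least remaining capacity is an assumption about what "based on remaining capacity" means.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; none of these types are real HDFS classes.
final class StorageRoundRobin {

    /** Hypothetical stand-in for one storage (volume) on a DataNode. */
    static final class Storage {
        final String id;
        final long remainingBytes;
        final Deque<Long> blockIds; // private copy of the caller's block IDs

        Storage(String id, long remainingBytes, List<Long> blockIds) {
            this.id = id;
            this.remainingBytes = remainingBytes;
            this.blockIds = new ArrayDeque<>(blockIds);
        }
    }

    /**
     * Picks up to maxBlocks block IDs, taking one block per storage per
     * round. Storages are visited fullest first (least remaining capacity).
     * Each pick is a constant-time poll, so nothing is skipped over.
     * Picked blocks are consumed from each Storage's internal queue.
     */
    static List<Long> selectBlocks(List<Storage> storages, int maxBlocks) {
        List<Storage> ordered = new ArrayList<>(storages);
        ordered.sort(Comparator.comparingLong((Storage s) -> s.remainingBytes));

        List<Long> picked = new ArrayList<>();
        boolean progressed = true;
        while (picked.size() < maxBlocks && progressed) {
            progressed = false; // stays false once every storage is drained
            for (Storage s : ordered) {
                if (picked.size() >= maxBlocks) {
                    break;
                }
                Long blockId = s.blockIds.pollFirst();
                if (blockId != null) {
                    picked.add(blockId);
                    progressed = true;
                }
            }
        }
        return picked;
    }
}
```

With a storage `s1` (500 bytes remaining, blocks 1 and 2) and a storage `s2` (100 bytes remaining, blocks 10, 11, 12), `selectBlocks(..., 4)` returns `[10, 1, 11, 2]`: the fuller storage `s2` is visited first in each round.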
Attachments
Issue Links
- depends upon: HDFS-7990 IBR delete ack should not be delayed (Resolved)
- is related to: HDFS-9260 Improve the performance and GC friendliness of NameNode startup and full block reports (Resolved)
- relates to: HDFS-11310 Reduce the performance impact of the balancer (trunk port) (Open)