In HDFS-10602, we found a failing case where the balancer keeps moving data back and forth between two DNs, so the balancer never finishes. While debugging this, I found what looks like a bug in how pending blocks are chosen in Dispatcher.Source.chooseNextMove.
The value of task.size is assigned in Balancer#matchSourceWithTargetToMove.
This value depends on the source and target nodes, and it is not always reduced exactly to 0 while choosing pending blocks. The balancer will keep moving data to the target node even after the number of bytes that need to be moved has already dropped below 0. This over-moves data, makes the cluster imbalanced again, and so triggers the next balancer iteration.
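To make the over-shoot concrete, here is a minimal standalone sketch (not the actual Hadoop code; the method and variable names are hypothetical) of a scheduling loop that keeps picking whole blocks while bytes remain, versus one that never schedules a block larger than what is left to move:

```java
import java.util.List;

/**
 * Hypothetical sketch of the block-scheduling loop discussed above.
 * It is NOT the real Dispatcher.Source.chooseNextMove implementation.
 */
public class BalancerSketch {
    /**
     * Schedules blocks until the remaining bytes to move run out and
     * returns the total bytes scheduled.
     *
     * Buggy behavior (fixed == false): schedule any block while bytes
     * remain, which can over-shoot and push the remaining size below 0.
     * Fixed behavior (fixed == true): skip blocks bigger than what is
     * left, so the scheduled total never exceeds the task size.
     */
    static long schedule(long bytesToMove, List<Long> blockSizes, boolean fixed) {
        long scheduled = 0;
        for (long blockSize : blockSizes) {
            long remaining = bytesToMove - scheduled;
            if (remaining <= 0) {
                break;                  // nothing left to move
            }
            if (fixed && blockSize > remaining) {
                continue;               // would over-shoot; skip this block
            }
            scheduled += blockSize;     // schedule this block for the move
        }
        return scheduled;
    }

    public static void main(String[] args) {
        List<Long> blocks = List.of(64L, 64L, 64L);
        // Buggy: schedules 128 bytes for a 100-byte task (over-moves by 28).
        System.out.println(schedule(100, blocks, false));
        // Fixed: schedules 64 bytes and stops, never exceeding the task size.
        System.out.println(schedule(100, blocks, true));
    }
}
```

With the buggy variant the remaining size ends at -28, which is exactly the kind of overshoot that makes the source side imbalanced in the other direction.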
We can optimize this as the title suggests; I think it can speed up the balancer.
See the logs for the failing case, or see HDFS-10602 (focus on the change records for the scheduled size of the target node; that is the debug info I added).