Instead of reverting, I am making a simple change to make it more usable. This will prevent users from hitting the same issues we had. The changes from HDFS-8188 do allow running the balancer at a higher throughput, but getting there requires turning multiple knobs. And when it runs slower than in the previous release, users will have no clue why. The default config values may result in degraded performance for users running a cluster with more than 20 nodes.
The main problem of HDFS-8188 is the way a thread pool is created per target. If the limit (max mover threads) is reached, the remaining pending moves are simply dropped (or even worse, it hangs without HDFS-11377), leading to degraded performance, as demonstrated above with graphs. The suggested workaround of "set the mover thread limit to 10,000 or 30,000" simply means removing the limit, i.e. the design cannot work with the limit in place.
The suggested improvement calculates the size of each mover thread pool, instead of using the configured fixed value. The total thread count limit is honored without causing the degradation seen with the original design.
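To illustrate the idea, here is a minimal sketch of how a per-target pool size could be derived from the total limit. It is not the actual patch: the class `MoverPoolSizing`, the method `poolSizePerTarget`, and the even-split policy are assumptions made for illustration only.

```java
// Hypothetical helper, not taken from the Hadoop source tree.
final class MoverPoolSizing {
    private MoverPoolSizing() {}

    /**
     * Splits a global mover-thread budget evenly across targets.
     *
     * Assumption: an even split with a floor of one thread per target.
     * When numTargets exceeds maxMoverThreads, the floor of one means the
     * sum of pool sizes is larger than the budget, so a real implementation
     * would need to handle (or at least log) that case.
     */
    static int poolSizePerTarget(int maxMoverThreads, int numTargets) {
        if (maxMoverThreads <= 0 || numTargets <= 0) {
            throw new IllegalArgumentException(
                "maxMoverThreads and numTargets must be positive");
        }
        // Integer division keeps perTarget * numTargets <= maxMoverThreads
        // whenever numTargets <= maxMoverThreads.
        return Math.max(1, maxMoverThreads / numTargets);
    }

    public static void main(String[] args) {
        int perTarget = poolSizePerTarget(1000, 40);
        System.out.println("threads per target: " + perTarget);
    }
}
```

With a computed size, adding more targets shrinks each pool instead of exhausting the global limit after the first few targets, which is the failure mode described above.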