needsBalance() uses an instance of ClusterLoadState to determine how many servers are active in the cluster. By default, "hbase.master.loadbalance.bytable" is false, so ClusterLoadState has a global view of the cluster and of how many servers there are. If "hbase.master.loadbalance.bytable" is set to true, ClusterLoadState only has a per-table view of the cluster. In that case, if the logic were moved into needsBalance(), needsBalance() would be called once for each table and the balancer might or might not run for any given table. It would still never run when the cluster has a single server. Is this what you would expect?
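To make the per-table case concrete, here is a rough sketch of what I mean. Everything in it is hypothetical: LoadView is only a stand-in for ClusterLoadState, the imbalance test is a simplified max/min comparison rather than the real slop-based check, and tablesToBalance() is an invented helper, not an existing HBase method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ByTableSketch {
    // Hypothetical stand-in for ClusterLoadState: it only knows the servers
    // in the view it was given (whole cluster, or one table's slice of it).
    static final class LoadView {
        final Map<String, Integer> regionsPerServer;
        LoadView(Map<String, Integer> regionsPerServer) {
            this.regionsPerServer = regionsPerServer;
        }
        int numServers() { return regionsPerServer.size(); }
    }

    // Counts invocations so the per-table behaviour is visible.
    static int needsBalanceCalls = 0;

    // Simplified check (assumption): never balance a single-server view,
    // otherwise balance when the region counts differ by more than one.
    static boolean needsBalance(LoadView view) {
        needsBalanceCalls++;
        if (view.numServers() < 2) {
            return false;
        }
        int min = Collections.min(view.regionsPerServer.values());
        int max = Collections.max(view.regionsPerServer.values());
        return max - min > 1;
    }

    // With by-table balancing, the check runs once per table view.
    static List<String> tablesToBalance(Map<String, LoadView> perTable) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, LoadView> e : perTable.entrySet()) {
            if (needsBalance(e.getValue())) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Integer> skewed = new LinkedHashMap<>();
        skewed.put("rs1", 10);
        skewed.put("rs2", 2);
        Map<String, Integer> single = new LinkedHashMap<>();
        single.put("rs1", 5);

        Map<String, LoadView> perTable = new LinkedHashMap<>();
        perTable.put("skewed_table", new LoadView(skewed));
        perTable.put("single_server_table", new LoadView(single));

        System.out.println(tablesToBalance(perTable)
            + " after " + needsBalanceCalls + " needsBalance() calls");
    }
}
```

With the two views in main(), needsBalance() is called twice and only the skewed table is selected, which is the "may or may not run for each table" behaviour described above.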
Is there any reason why ClusterLoadState and ClusterStatus cannot be merged? At the moment, a new instance of each is created every time the balancer runs, and ClusterStatus is only used by StochasticLoadBalancer. Wouldn't it be better to have a single ClusterStatus object that also contains the load information, and to call balancer.needsBalance() from HMaster#balance() before balancer.balanceCluster()?
The two other checks in HMaster#balance(), isRegionsInTransition() and areDeadServersInProgress(), could also be moved into balancer.needsBalance().
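Roughly the shape I have in mind, as a sketch only. ClusterSnapshot, BalancerSketch, SimpleBalancerSketch and MasterSketch are hypothetical names made up for illustration, the two boolean fields stand in for the results of isRegionsInTransition() and areDeadServersInProgress(), and the imbalance test and the returned "plan" are placeholders rather than the real balancer logic.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical merged object: server membership, per-server load, and the
// two master-side conditions, all built once per balancer run.
final class ClusterSnapshot {
    final Map<String, Integer> regionsPerServer;
    final boolean regionsInTransition;
    final boolean deadServersInProgress;

    ClusterSnapshot(Map<String, Integer> regionsPerServer,
                    boolean regionsInTransition,
                    boolean deadServersInProgress) {
        this.regionsPerServer = regionsPerServer;
        this.regionsInTransition = regionsInTransition;
        this.deadServersInProgress = deadServersInProgress;
    }
}

// Hypothetical balancer contract: one gate, one planner.
interface BalancerSketch {
    boolean needsBalance(ClusterSnapshot s);
    List<String> balanceCluster(ClusterSnapshot s);
}

final class SimpleBalancerSketch implements BalancerSketch {
    @Override
    public boolean needsBalance(ClusterSnapshot s) {
        // The checks that currently live in the master move in here.
        if (s.regionsInTransition || s.deadServersInProgress) {
            return false;
        }
        if (s.regionsPerServer.size() < 2) {
            return false;
        }
        // Simplified imbalance test (assumption, not the slop-based one).
        int min = Collections.min(s.regionsPerServer.values());
        int max = Collections.max(s.regionsPerServer.values());
        return max - min > 1;
    }

    @Override
    public List<String> balanceCluster(ClusterSnapshot s) {
        // Placeholder plan: name the most and least loaded servers.
        String busiest = null;
        String idlest = null;
        for (Map.Entry<String, Integer> e : s.regionsPerServer.entrySet()) {
            if (busiest == null || e.getValue() > s.regionsPerServer.get(busiest)) {
                busiest = e.getKey();
            }
            if (idlest == null || e.getValue() < s.regionsPerServer.get(idlest)) {
                idlest = e.getKey();
            }
        }
        return Collections.singletonList(
            "move one region from " + busiest + " to " + idlest);
    }
}

final class MasterSketch {
    // The master only asks the gate question, then delegates.
    static List<String> balance(BalancerSketch balancer, ClusterSnapshot s) {
        if (!balancer.needsBalance(s)) {
            return Collections.emptyList();
        }
        return balancer.balanceCluster(s);
    }

    public static void main(String[] args) {
        Map<String, Integer> load = new LinkedHashMap<>();
        load.put("rs1", 8);
        load.put("rs2", 1);
        BalancerSketch balancer = new SimpleBalancerSketch();
        System.out.println(balance(balancer, new ClusterSnapshot(load, true, false)));
        System.out.println(balance(balancer, new ClusterSnapshot(load, false, false)));
    }
}
```

The point of the sketch is that the master ends up with a single "should we balance?" question, and each balancer implementation can decide which conditions apply to it.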