There are two possible problems:
-> When we populate regionsToMove while iterating the server info in descending order, there is a chance that the same region is added twice.
This is because the first loop randomizes the regions,
whereas when neededRegions != 0 we just take the region at the index and add it again. This can put the same region in the regionsToMove list twice.
-> The second problem follows from the first:
when the first problem happens, there is a chance that
a region to move has the same src and destination, and the same region can be picked every 5 minutes.
If I have meta and root on the two most loaded region servers (3 RS in total), we just skip the regions on those region servers and populate the regions from the least loaded RS.
Then in the next loop we iterate from the least loaded server and set the destination to that same server.
This leads to a condition where balancing happens every 5 minutes and the src and dest server are the same.
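
The two conditions above can be guarded against roughly as follows. This is only a minimal sketch, not the actual balancer code or the committed fix: the class `RegionPlanGuard`, its nested `Plan` type, and the methods `offer` and `assign` are hypothetical names made up for illustration, and regions and servers are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of HBase.
class RegionPlanGuard {

    // Hypothetical stand-in for a move plan (region, source RS, destination RS).
    static final class Plan {
        final String region;
        final String src;
        String dest; // null until a destination is assigned

        Plan(String region, String src) {
            this.region = region;
            this.src = src;
        }
    }

    private final List<Plan> regionsToMove = new ArrayList<>();
    private final Set<String> alreadyQueued = new HashSet<>();

    // Queues the region only if it has not been queued before,
    // so the same region cannot appear twice in regionsToMove.
    boolean offer(String region, String src) {
        if (!alreadyQueued.add(region)) {
            return false; // duplicate, ignored
        }
        regionsToMove.add(new Plan(region, src));
        return true;
    }

    // Assigns a destination unless it is the server the region already lives on.
    boolean assign(Plan plan, String dest) {
        if (plan.src.equals(dest)) {
            return false; // src and dest are the same RS, nothing to move
        }
        plan.dest = dest;
        return true;
    }

    List<Plan> plans() {
        return regionsToMove;
    }
}
```

With such a guard, a second `offer` for an already queued region returns false, and `assign` refuses a destination equal to the source, so a plan that would "move" a region onto its own server is never produced.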
|Field||Original Value||New Value|
|Summary||Balancer in 0.90 algo leading to same region balanced twice and picking same region with Src and Destination as same RS.||In 0.90, balancer algo leading to same region balanced twice and picking same region with Src and Destination as same RS.|
|Status||Open [ 1 ]||Resolved [ 5 ]|
|Hadoop Flags||Reviewed [ 10343 ]|
|Resolution||Fixed [ 1 ]|