I have one question.
After this patch, the Balancer's sources and targets are StorageGroups rather than the datanodes themselves,
i.e. the candidates for balancing are now StorageGroups, not datanodes.
One thing we need to consider here: while balancing, blocks should not be moved from a StorageGroup of one storage type to a StorageGroup of a different storage type.
For example, blocks on SSD (over-utilized on one node) should not be moved to DISK (under-utilized on another node).
In the current patch, utilization is calculated separately for each storageType,
but I haven't seen a corresponding condition where targets are matched for movement.
I think a storageType check is also required when selecting the target candidate. Am I right?
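To make the suggestion concrete, here is a minimal sketch of the kind of check I mean. It is not taken from the patch: the class `StorageTypeMatchSketch`, the nested `Group` type, and the method `chooseTarget` are hypothetical stand-ins, and the "lower utilization than the source" rule is only an assumed placeholder for the real matching logic.

```java
import java.util.List;
import java.util.Optional;

public class StorageTypeMatchSketch {
    // Simplified stand-in for the storage types discussed above.
    enum StorageType { DISK, SSD }

    // Hypothetical, simplified stand-in for the patch's StorageGroup.
    static final class Group {
        final String datanode;
        final StorageType storageType;
        final double utilization; // fraction in [0, 1]

        Group(String datanode, StorageType storageType, double utilization) {
            this.datanode = datanode;
            this.storageType = storageType;
            this.utilization = utilization;
        }
    }

    // Returns the first candidate that has the same storage type as the
    // source and a lower utilization; candidates of another type are skipped.
    static Optional<Group> chooseTarget(Group source, List<Group> candidates) {
        for (Group candidate : candidates) {
            if (candidate.storageType != source.storageType) {
                continue; // never pair SSD with DISK, or vice versa
            }
            if (candidate.utilization < source.utilization) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With this check, an over-utilized SSD group is only ever paired with another SSD group, even when a DISK group elsewhere has more free space.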
Also, the following nit needs to be fixed:
private StorageGroup(DatanodeStorageReport r, StorageType storageType,
double utilization, long maxSize2Move)
Here, r is not used. I think this parameter and the related changes can be removed.