I have some doubts and would like some suggestions before proceeding.
The following scenarios need to be considered:
1. All the regions are disabled and the state in ZooKeeper is DISABLED.
2. The regions are offlined, but the AM went down while the ZooKeeper state was DISABLING.
3. The regions are not yet offlined (or only a few regions are offlined) and the AM went down while the ZooKeeper state was DISABLING.
Now, on a master switch or a master restart,
how can we decide which regions were offlined and which were not?
Though we can get the state of the table as either DISABLED or DISABLING, region-wise I am not able to infer what state each region is in.
The reason I need this information is that
the solution should check the state of the table while populating the regions map during master startup.
Checking only for DISABLED state:
Check for the DISABLED state, and during master startup add to the regions map those regions whose table is not in the DISABLED state.
If I check only for the DISABLED state and the table is in DISABLING state, then
when I try to enable it after a master restart (or switch) we will not be able to scan the table, because none of the regions will be enabled during the enable:
the regions in the META table and the regions that I populated in the regions map are the same.
So I will get the same issue as in the description of the defect.
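
To make the DISABLED-only check concrete, here is a rough sketch of the startup filter as I understand it. This is not the actual master code: StartupRegionFilter, TableState and regionsToTrack are hypothetical names used only for illustration, and the table state is assumed to have already been read from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of populating the regions map at master startup.
final class StartupRegionFilter {
    enum TableState { ENABLED, DISABLING, DISABLED }

    // Returns the regions that would be added to the regions map:
    // every region whose table state is not in skipStates.
    static List<String> regionsToTrack(Map<String, String> regionToTable,
                                       Map<String, TableState> tableStates,
                                       Set<TableState> skipStates) {
        List<String> tracked = new ArrayList<>();
        for (Map.Entry<String, String> e : regionToTable.entrySet()) {
            // Assumption: a table with no recorded state is treated as ENABLED.
            TableState state = tableStates.getOrDefault(e.getValue(), TableState.ENABLED);
            if (!skipStates.contains(state)) {
                tracked.add(e.getKey());
            }
        }
        return tracked;
    }

    public static void main(String[] args) {
        Map<String, String> regionToTable = new LinkedHashMap<>();
        regionToTable.put("r1", "t1");
        regionToTable.put("r2", "t1");
        Map<String, TableState> states = new LinkedHashMap<>();
        states.put("t1", TableState.DISABLING);

        // DISABLED-only check: a DISABLING table is not skipped,
        // so all of its regions end up in the regions map.
        System.out.println(
            regionsToTrack(regionToTable, states, EnumSet.of(TableState.DISABLED)));
        // prints [r1, r2]
    }
}
```

With the table in DISABLING state every region is tracked, so the map matches META exactly and a later enable finds nothing that looks like it needs to be brought online.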
Checking for DISABLED and DISABLING state:
If I check the ZooKeeper state for both DISABLED and DISABLING, then on master restart (switch) only those regions whose table is not in DISABLED or DISABLING state are populated.
When I then try to enable the table, if a region was not offlined as part of the disable flow (scenario 3), waitUntilDone() in BulkAssigner is not aware that the region is
already online and keeps waiting, because waitUntilDone() compares the number of regions that become online in the regions map against the actual count it gets from the META table.
This makes the enable go into a loop.
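
A simplified model of the wait described above, under my reading of the problem. EnableWaitSketch and waitUntilAllOnline are hypothetical names and do not reproduce the real BulkAssigner code. The sketch assumes a region that is already open on a region server but was skipped at startup never gets recorded as online in the regions map, and it bounds the wait with maxPolls so that it terminates.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the enable flow waiting for all regions to come online.
final class EnableWaitSketch {

    // Returns true if the online count in the regions map reaches the META
    // count within maxPolls polls, false if the wait would keep spinning.
    static boolean waitUntilAllOnline(Set<String> regionsInMeta,
                                      Set<String> openButUntracked,
                                      int maxPolls) {
        Set<String> onlineInRegionsMap = new HashSet<>();
        for (int poll = 0; poll < maxPolls; poll++) {
            for (String region : regionsInMeta) {
                // Assumption: only regions that were really offline get
                // assigned and recorded; already-open, untracked ones do not.
                if (!openButUntracked.contains(region)) {
                    onlineInRegionsMap.add(region);
                }
            }
            if (onlineInRegionsMap.size() == regionsInMeta.size()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> meta = new HashSet<>(Arrays.asList("r1", "r2", "r3"));
        // r2 was never offlined before the AM went down (scenario 3).
        Set<String> stillOpen = Collections.singleton("r2");
        System.out.println(waitUntilAllOnline(meta, stillOpen, 10)); // prints false
    }
}
```

In this model the count from the regions map stays at 2 while META reports 3, so without the artificial maxPolls bound the wait would never finish.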
Am I clear about the problem? So, before enabling any table, do we need to check the state of the table and, if it is DISABLING, make all those regions go to