If all the RSes in an RSGroup hosting user tables fail and recover, the master still looks for the old RSes (with the old timestamp in the RS identifier) when assigning regions. As a result, the regions are left in transition, making the tables in the RSGroup unavailable. Users need to restart the master or manually assign the regions to make the tables available again.
Steps to reproduce the scenario in a local cluster:
- Add the required properties to hbase-site.xml to enable rsgroup and start HBase
- Bring up multiple region servers using local-regionservers.sh start
- Create a rsgroup and move a subset of regionservers to the group
- Create a table, move it to the group and put some data
- Stop the regionservers in the group and restart them
- From the master UI, we can see that the region for the table is in transition and that the RS name in the RIT message has the old timestamp.
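
To illustrate why the old timestamp matters, here is a minimal Python sketch. It is not the master's actual code. It only assumes that an RS identifier has the form host,port,startcode, where the start code is the timestamp taken when the RS process started. The host, port and start code values, and the helper names stale_targets and remap_by_address, are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class ServerName(NamedTuple):
    """RS identifier of the form host,port,startcode."""
    host: str
    port: int
    start_code: int  # timestamp recorded when the RS process started

    @classmethod
    def parse(cls, text: str) -> "ServerName":
        host, port, start_code = text.split(",")
        return cls(host, int(port), int(start_code))

    @property
    def address(self) -> str:
        # host:port stays the same across a restart; the start code does not
        return f"{self.host}:{self.port}"


def stale_targets(planned: List[ServerName], online: List[ServerName]) -> List[ServerName]:
    """Return planned assignment targets that are no longer online."""
    live = set(online)
    return [s for s in planned if s not in live]


def remap_by_address(planned: List[ServerName], online: List[ServerName]) -> List[Optional[ServerName]]:
    """Hypothetical fix-up: match each planned target to a live RS by host:port."""
    by_addr = {s.address: s for s in online}
    return [by_addr.get(s.address) for s in planned]


# Same host and port, but the restart produced a new start code (values invented).
old_rs = ServerName.parse("localhost,16021,1500000000000")
new_rs = ServerName.parse("localhost,16021,1500000600000")

print(stale_targets([old_rs], [new_rs]))     # the old identifier is not among the live servers
print(remap_by_address([old_rs], [new_rs]))  # matching on host:port finds the restarted RS
```

An assignment that targets the full old identifier can never complete, because no live server carries that start code. Matching on host:port, which is unchanged by the restart, would find the recovered RS.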