In a test cluster using replicated persistent Regions, all of the servers were shut down and restarted. The restart hung, showing a cycle in disk store dependencies.
After looking at the logs for all members, the "members with potentially new data" for each member were found to be:
It appears that there is a cycle in this "waiting for another online member" graph: 3 > 4 > 10 > 3.
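A cycle like this can be confirmed mechanically once the "waiting for" relationships have been pulled out of the logs. The following is only a minimal sketch, not Geode code: the class name WaitGraphCycle and the method findCycle are hypothetical, and it assumes that "3 > 4" means member 3 is waiting for member 4. It does a depth-first search over the wait-for graph and reports the first cycle it finds.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Geode: finds a cycle in a
// "member -> members it is waiting for" graph using depth-first search.
public class WaitGraphCycle {

  // Returns one cycle as a list of member ids, with the first id repeated
  // at the end, or an empty list if the graph has no cycle.
  static List<Integer> findCycle(Map<Integer, List<Integer>> waitsFor) {
    Set<Integer> done = new HashSet<>();
    for (Integer start : waitsFor.keySet()) {
      List<Integer> cycle = visit(start, waitsFor, done, new ArrayList<>());
      if (!cycle.isEmpty()) {
        return cycle;
      }
    }
    return new ArrayList<>();
  }

  private static List<Integer> visit(Integer member,
      Map<Integer, List<Integer>> waitsFor, Set<Integer> done, List<Integer> path) {
    int idx = path.indexOf(member);
    if (idx >= 0) {
      // The member is already on the current path, so we have closed a loop.
      List<Integer> cycle = new ArrayList<>(path.subList(idx, path.size()));
      cycle.add(member);
      return cycle;
    }
    if (done.contains(member)) {
      // Fully explored earlier without finding a cycle through it.
      return new ArrayList<>();
    }
    path.add(member);
    for (Integer next : waitsFor.getOrDefault(member, List.of())) {
      List<Integer> cycle = visit(next, waitsFor, done, path);
      if (!cycle.isEmpty()) {
        return cycle;
      }
    }
    path.remove(path.size() - 1); // backtrack
    done.add(member);
    return new ArrayList<>();
  }

  public static void main(String[] args) {
    // Only the edges named in the report; "a > b" is assumed to mean
    // "a is waiting for b".
    Map<Integer, List<Integer>> waitsFor = new LinkedHashMap<>();
    waitsFor.put(3, List.of(4));
    waitsFor.put(4, List.of(10));
    waitsFor.put(10, List.of(3));
    System.out.println(findCycle(waitsFor)); // prints [3, 4, 10, 3]
  }
}
```

If none of the members in a reported cycle can come online without another member in that cycle, the restart cannot make progress on its own, which matches the hang seen here.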
The problem seems to have cropped up after the fix for GEODE-7196 was merged. That fix changed the timing of member-departed notifications such that a server might close a Region's Persistence Advisor before being notified that another server was shutting down. We used to do this notification upon receipt of a ShutdownMessage, but now we do it only when the membership view has changed.