Description
There is a long-standing race condition that has been glossed over until now because of our "loose" ZK usage.
The ZK server process can be in a state where it accepts the socket connection from our client in the master or RS, but any operation we then run against the server fails with a ConnectionLossException. The ZK client handles this automatically and reconnects properly, as long as we do not abort when we get this exception.
This works on the last 0.89 and even with the master rewrite, but as we move towards strict usage of ZK, we should wait for ZK to be available before proceeding with startup.
I already have a working patch in a local branch. Will put it up against the new master soon.
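
For illustration only, here is a minimal sketch of what "wait for ZK availability before proceeding with startup" could look like. It is not the patch mentioned above. The class name ZkStartupWait, the Probe interface, and the maxAttempts/sleepMillis parameters are all hypothetical. The sketch has no ZooKeeper dependency: the probe stands in for one cheap operation against the server (for example an exists() call on the root znode), and any exception from it is treated as a sign that ZK is not ready yet.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: blocks startup until one operation against ZK succeeds. */
final class ZkStartupWait {

    /** Hypothetical stand-in for a single cheap ZK operation. */
    interface Probe {
        void run() throws Exception;
    }

    private ZkStartupWait() {}

    /**
     * Runs the probe until it succeeds or maxAttempts is exhausted.
     * Returns true once the probe succeeds, false if every attempt failed.
     */
    static boolean waitUntilAvailable(Probe probe, int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                probe.run();
                return true; // the server answered a real operation, not just the socket connect
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption during startup
            } catch (Exception e) {
                // Assumption: a failure here means ZK accepted the connection but is
                // not serving requests yet (the connection-loss case), so retry
                // instead of aborting.
                if (attempt < maxAttempts) {
                    TimeUnit.MILLISECONDS.sleep(sleepMillis);
                }
            }
        }
        return false;
    }
}
```

In real code the catch would presumably be narrowed to the connection-loss exception so that other failures still surface, and the caller would decide whether a false return should abort startup.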