Varun Saxena, sorry I didn't see your ping. I have gone through a few of the ZooKeeper-related comments in the jira. Based on my understanding, I will try to help you guys.
So I guess even ZKRMStateStore can get inconsistent data if it disconnects from one ZooKeeper server and connects to another. Not 100% sure on this though. Maybe a ZooKeeper guy can chime in on this. Rakesh R, will sync work in this scenario?
Generally, sync() is recommended only when the ZooKeeper service is accessed from multiple clients. If only one ZooKeeper client performs the update operations, it is not required to call sync() after the connection is re-established. For example, suppose zkclient is connected to ZK1 and successfully creates a znode. If zkclient is then disconnected and successfully reconnects to ZK2, it will see the same znode on that server as well. I agree with Tsuyoshi Ozawa's comments about sync().
FYI: sync() is a costly operation; internally it forces the connected server to sync up its data from the Leader ZK server. In your logic, it could probably be called once, right after creating a new ZooKeeper connection and before performing any other operation.
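To illustrate the "sync once after a new connection" idea: sync() in the ZooKeeper Java client is asynchronous, so the caller has to wait for the callback before issuing its first read. Below is a minimal sketch of that waiting step only. AsyncSyncClient and SyncCallback are hypothetical stand-ins (not ZooKeeper types) for ZooKeeper#sync(path, callback, ctx) and its callback, used so the snippet compiles without the ZooKeeper jar; the TIMED_OUT sentinel is also an assumption of this sketch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the asynchronous ZooKeeper#sync(path, callback, ctx). */
interface AsyncSyncClient {
    void sync(String path, SyncCallback cb);
}

/** Hypothetical callback; rc == 0 plays the role of KeeperException.Code.OK. */
interface SyncCallback {
    void done(int rc, String path);
}

final class SyncBarrier {
    static final int OK = 0;
    /** Sentinel invented for this sketch: no callback arrived in time. */
    static final int TIMED_OUT = Integer.MIN_VALUE;

    private SyncBarrier() {}

    /**
     * Issues one sync() and blocks until its callback fires or the timeout
     * elapses. Returns the callback's result code, or TIMED_OUT.
     */
    static int syncAndWait(AsyncSyncClient client, String path, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicInteger result = new AtomicInteger(TIMED_OUT);
        client.sync(path, (rc, p) -> {
            result.set(rc);
            latch.countDown();
        });
        try {
            if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return TIMED_OUT;
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report "not synced".
            Thread.currentThread().interrupt();
            return TIMED_OUT;
        }
        return result.get();
    }
}
```

With the real client, the lambda passed to syncAndWait would wrap zk.sync(path, cb, ctx); the store would call this once after constructing a new ZooKeeper object and only proceed to reads if the result is OK.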
It looks exhaust, but it's not: reconnection to other ZooKeeper servers is done at ClientCnxn#startConnect in a main thread of ZooKeeper's client. Please note that a session is not equal to a connection in ZooKeeper. What we can do is retry with the current ZooKeeper client. I also noticed that we shouldn't create a new session when SESSIONMOVED occurs.
IMHO it is fine not to recreate a connection on SESSIONMOVED. If someone saves the session id and then uses the constructor new ZooKeeper(connectString, sessionTimeout, watcher, sessionId, sessionPasswd), it may give unexpected results. I hope you are not using this one in YARN.
Coming to the patch: By definition, CONNECTIONLOSS also means that we should recreate the connection?
It is not required to create a new zkclient connection on any error other than SESSIONEXPIRED, because zkclient internally retries the connection against all the servers that were passed to the ZooKeeper constructor. I have one suggestion: after creating a new ZooKeeper connection, do a sync() call before performing any other operation. I feel this will ensure consistency of the data.
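The rule above can be sketched as a small decision function. This is only an illustration, not the actual ZKRMStateStore code: ZkError, Action and ZkErrorPolicy are hypothetical names, and the enum constants simply mirror the corresponding KeeperException.Code names so that the snippet compiles without the ZooKeeper jar.

```java
/** Hypothetical mirror of a few KeeperException.Code names. */
enum ZkError { CONNECTIONLOSS, OPERATIONTIMEOUT, SESSIONMOVED, SESSIONEXPIRED }

/** What the store should do after an operation fails with a given error. */
enum Action { RETRY_WITH_SAME_CLIENT, RECREATE_CLIENT_THEN_SYNC }

final class ZkErrorPolicy {
    private ZkErrorPolicy() {}

    static Action actionFor(ZkError error) {
        // Only an expired session is unrecoverable for the existing client.
        // For every other error the client keeps retrying the servers from
        // its connect string, so the operation is retried on the same client.
        if (error == ZkError.SESSIONEXPIRED) {
            return Action.RECREATE_CLIENT_THEN_SYNC;
        }
        return Action.RETRY_WITH_SAME_CLIENT;
    }
}
```

In a real retry loop the caller would map e.code() of the caught KeeperException to this decision, and in the RECREATE_CLIENT_THEN_SYNC case issue one sync() on the new client before its first read.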