Details
- Improvement
- Status: Closed
- Major
- Resolution: Invalid
Description
I think this is part of the Master/ZooKeeper refactoring project, but I'm filing it here to make sure we cover it. Currently in ZKW (and in other places around the code base) we perform ZK operations without really handling the exceptions. For example, in ZKW.setClusterState:
    } catch (InterruptedException e) {
      LOG.warn("<" + instanceName + ">" + "Failed to set state node in ZooKeeper", e);
    } catch (KeeperException e) {
      if (e.code() == KeeperException.Code.NODEEXISTS) {
        LOG.debug("<" + instanceName + ">" + "State node exists.");
      } else {
        LOG.warn("<" + instanceName + ">" + "Failed to set state node in ZooKeeper", e);
      }
    }
It has always been like this, ever since we started using ZK.
What if the session was expired? What if it was only the connection that had a blip? Do we handle it correctly? We need to have this discussion.
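To give the discussion something concrete to start from, here is a minimal sketch of one way the failure cases could be told apart. It is not what ZKW does today and it is not a proposal for the final design. The ErrorCode enum is a hypothetical stand-in that mirrors the names of three real KeeperException.Code values (NODEEXISTS, CONNECTIONLOSS, SESSIONEXPIRED) so the example compiles without the ZooKeeper jar; the Action enum, the decide method, and the class name are likewise assumptions made for illustration.

```java
// Sketch only: classifies a ZooKeeper failure into a coarse recovery action.
// ErrorCode is a hypothetical stand-in for KeeperException.Code; Action and
// decide() are illustrative names, not existing HBase or ZooKeeper APIs.
final class ZkFailurePolicy {

    /** Mirrors the names of the relevant KeeperException.Code values. */
    enum ErrorCode { NODEEXISTS, CONNECTIONLOSS, SESSIONEXPIRED, OTHER }

    /** What the caller might do in response (assumed policy). */
    enum Action { IGNORE, RETRY, ABORT, PROPAGATE }

    static Action decide(ErrorCode code) {
        switch (code) {
            case NODEEXISTS:
                // The node is already there; for a "set state" call this is benign.
                return Action.IGNORE;
            case CONNECTIONLOSS:
                // Possibly just a blip: the session may still be valid, so the
                // operation can be retried once the client reconnects.
                return Action.RETRY;
            case SESSIONEXPIRED:
                // The session is gone and ephemeral nodes with it; logging and
                // carrying on is not enough here.
                return Action.ABORT;
            default:
                // Anything else should reach the caller rather than be swallowed.
                return Action.PROPAGATE;
        }
    }

    public static void main(String[] args) {
        for (ErrorCode code : ErrorCode.values()) {
            System.out.println(code + " -> " + decide(code));
        }
    }
}
```

Whether ABORT is the right response to an expired session, and how many times a connection loss should be retried, are exactly the questions this issue asks us to settle.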
Issue Links
- relates to HBASE-2223 Handle 10min+ network partitions between clusters (Closed)