Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 2.0.0-alpha
- None
- None
Description
The format and ZKFC startup flows proceed as soon as the zkclient connection object is created, without waiting to confirm that the connection has been fully established. If the connection is still incomplete when the next ZooKeeper operation is issued, that operation fails.
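To illustrate the kind of wait the description calls for, here is a minimal sketch of blocking until a "connected" event arrives, using only the JDK. It is not the actual patch: `ConnectionWaiter`, `onConnected`, and `awaitConnected` are hypothetical names introduced for this example. In real ZKFC code the callback would presumably be driven by the ZooKeeper `Watcher` receiving a `SyncConnected` event.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not from the Hadoop source): lets a caller block
// until an asynchronous "connected" notification has been observed.
class ConnectionWaiter {
    private final CountDownLatch connected = new CountDownLatch(1);

    // Intended to be called from the client's event callback once the
    // session is established.
    void onConnected() {
        connected.countDown();
    }

    // Returns true if the connected event was seen within timeoutMs,
    // false if the wait timed out.
    boolean awaitConnected(long timeoutMs) throws InterruptedException {
        return connected.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ConnectionWaiter waiter = new ConnectionWaiter();

        // Simulate the connection event arriving later on another thread.
        Thread eventThread = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            waiter.onConnected();
        });
        eventThread.start();

        // Only proceed to the first request after the wait succeeds.
        if (!waiter.awaitConnected(5000)) {
            throw new IllegalStateException("Timed out waiting for connection");
        }
        System.out.println("connected; safe to issue the first request");
        eventThread.join();
    }
}
```

With a wait like this in place, a call such as the `exists` check in the trace below would only run after the session is known to be established, and a timeout would surface as an explicit error instead of a `ConnectionLoss`.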
Exception trace for format
12/05/30 19:48:24 INFO zookeeper.ClientCnxn: Socket connection established to HOST-xx-xx-xx-55/xx.xx.xx.55:2182, initiating session
12/05/30 19:48:24 INFO zookeeper.ClientCnxn: Session establishment complete on server HOST-xx-xx-xx-55/xx.xx.xx.55:2182, sessionid = 0x1379da4660c0014, negotiated timeout = 5000
12/05/30 19:48:24 WARN ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x1379da4660c0014
12/05/30 19:48:24 INFO zookeeper.ZooKeeper: Session: 0x1379da4660c0014 closed
12/05/30 19:48:24 INFO zookeeper.ClientCnxn: EventThread shut down
Exception in thread "main" java.io.IOException: Couldn't determine existence of znode '/hadoop-ha/hacluster'
    at org.apache.hadoop.ha.ActiveStandbyElector.parentZNodeExists(ActiveStandbyElector.java:263)
    at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:257)
    at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:195)
    at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:58)
    at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:163)
    at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:159)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
    at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:159)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:171)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hadoop-ha/hacluster
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1049)
    at org.apache.hadoop.ha.ActiveStandbyElector.parentZNodeExists(ActiveStandbyElector.java:261)
    ... 8 more
Attachments
Issue Links
- depends upon: HADOOP-8591 TestZKFailoverController tests time out (Resolved)