Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version/s: 3.3.5
Fix Version/s: None
Component/s: None
Labels: None
Environment:
Server: zookeeper 3.3.5+dfsg1-1ubuntu1
Client: zookeeper 3.4.5 from Cloudera 4.3.0
Description
This happens when HBase region servers connect to the HBase master nodes. Once an HBase region server joins the cluster, the master logs the following error:
2013-08-07 13:35:18,676 WARN org.apache.zookeeper.ClientCnxn: Session 0xd4058c4d7940003 for server zk-01.dev.dailymotion.com/10.194.60.13:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Xid out of order. Got Xid 56 with err -101 expected Xid 55 for a packet with details: clientPath:null serverPath:null finished:false header:: 55,14 replyHeader:: 0,0,-4 request:: org.apache.zookeeper.MultiTransactionRecord@360193e5 response:: org.apache.zookeeper.MultiResponse@0
    at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:795)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:94)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-08-07 13:35:18,676 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
2013-08-07 13:35:18,676 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper multi failed after 3 retries
2013-08-07 13:35:18,677 ERROR org.apache.hadoop.hbase.master.AssignmentManager: Unable to ensure that the table -ROOT- will be enabled because of a ZooKeeper issue
2013-08-07 13:35:18,677 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2013-08-07 13:35:18,677 FATAL org.apache.hadoop.hbase.master.HMaster: Unable to ensure that the table -ROOT- will be enabled because of a ZooKeeper issue
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:931)
    at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:531)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1440)
    at org.apache.hadoop.hbase.zookeeper.ZKTable.setTableState(ZKTable.java:245)
    at org.apache.hadoop.hbase.zookeeper.ZKTable.setEnabledTable(ZKTable.java:325)
    at org.apache.hadoop.hbase.master.AssignmentManager.setEnabledTable(AssignmentManager.java:3576)
    at org.apache.hadoop.hbase.master.AssignmentManager.setEnabledTable(AssignmentManager.java:2340)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1674)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
    at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
    at org.apache.hadoop.hbase.master.AssignmentManager.addToRITandCallClose(AssignmentManager.java:675)
    at org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:586)
    at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransition(AssignmentManager.java:525)
    at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransitionAndBlockUntilAssigned(AssignmentManager.java:489)
    at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:679)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:583)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:395)
    at java.lang.Thread.run(Thread.java:722)
2013-08-07 13:35:18,678 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2013-08-07 13:35:18,678 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server stopped; skipping assign of -ROOT-,,0.70236052 state=OFFLINE, ts=1375881792131, server=null
2013-08-07 13:35:18,678 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 70236052/-ROOT-
2013-08-07 13:35:18,678 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: masternode-01.dev.dailymotion.com,60000,1375880747185.timeoutMonitor exiting
2013-08-07 13:35:18,679 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=masternode-01.dev.dailymotion.com,60000,1375880747185, region=70236052/-ROOT-, which is more than 15 seconds late
2013-08-07 13:35:18,776 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
2013-08-07 13:35:18,776 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
2013-08-07 13:35:18,776 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper getData failed after 3 retries
2013-08-07 13:35:18,777 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
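
The failing request in the trace is a MultiTransactionRecord, and the environment pairs a 3.3.5 server with a 3.4.5 client. The multi-op API first shipped in ZooKeeper 3.4.0, so one plausible reading (not confirmed in this report) is that the server cannot handle the request. The sketch below compares a server version string against that minimum. The function names and the MULTI_MIN_VERSION constant are hypothetical and are not part of ZooKeeper or HBase.

```python
import re

# Assumption: ZooKeeper's multi-op API was introduced in release 3.4.0.
MULTI_MIN_VERSION = (3, 4, 0)


def parse_version(text):
    """Return (major, minor, patch) from a version string.

    Packaging suffixes such as '+dfsg1-1ubuntu1' are ignored; only the
    first 'X.Y.Z' triple in the string is used.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def server_supports_multi(server_version):
    """True if the server version is at least MULTI_MIN_VERSION."""
    return parse_version(server_version) >= MULTI_MIN_VERSION


if __name__ == "__main__":
    # Version strings taken from the environment field of this report.
    for name, version in [("server", "3.3.5+dfsg1-1ubuntu1"),
                          ("client", "3.4.5")]:
        print(name, parse_version(version), server_supports_multi(version))
```

A check like this only flags the version gap; whether the gap explains the Xid error would still need to be verified against the server itself.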