Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-13515

Redundant transitionToActive call can cause a NameNode to crash

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.5.0
    • Fix Version/s: None
    • Component/s: ha
    • Labels:
      None

      Description

      The situation in parts is similar to HADOOP-8217, but the cause is different and so is the result.

      Consider this situation:

      • At the beginning NN1 is Active, NN2 is Standby
      • ZKFC1 faces a ZK disconnect (not a session timeout, just a socket disconnect) and thereby reconnects
      2016-08-11 07:00:46,068 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 4000ms for sessionid 0x4566f0c97500bd9, closing socket connection and attempting reconnect
      2016-08-11 07:00:46,169 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...
      …
      2016-08-11 07:00:46,610 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
      
      • The reconnection on the ZKFC1 triggers the elector code, and the elector re-run finds that NN1 should be the new active (a redundant decision cause NN1 is already active)
      2016-08-11 07:00:46,615 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
      2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: …
      2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: But old node has our own data, so don't need to fence it.
      
      • The ZKFC1 sets the new ZK data, and fires a NN1 RPC call of transitionToActive
      2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /hadoop-ha/nameservice1/ActiveBreadCrumb to indicate that the local node is the most recent active...
      2016-08-11 07:00:46,649 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 175: Call -> nn01/10.10.10.10:8022: transitionToActive {reqInfo { reqSource: REQUEST_BY_ZKFC }}
      
      • At the same time as the transitionToActive call is in progress at NN1, but not complete yet, the ZK session of ZKFC1 is timed out by ZK Quorum, and a watch notification is sent to ZKFC2
      2016-08-11 07:01:00,003 DEBUG org.apache.zookeeper.ClientCnxn: Got notification sessionid:0x4566f0c97500bde
      2016-08-11 07:01:00,004 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent state:SyncConnected type:NodeDeleted path:/hadoop-ha/nameservice1/ActiveStandbyElectorLock for sessionid 0x4566f0c97500bde
      
      • ZKFC2 responds by marking NN2 as standby, which succeeds (NN hasn't handled transitionToActive call yet due to busy status, but has handled transitionToStandby before it)
      2016-08-11 07:01:00,013 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
      2016-08-11 07:01:00,018 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at nn01/10.10.10.10:8022
      2016-08-11 07:01:00,020 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: Call -> nn01/10.10.10.10:8022: transitionToStandby {reqInfo { reqSource: REQUEST_BY_ZKFC }}
      2016-08-11 07:01:03,880 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: transitionToStandby took 3860ms
      
      • ZKFC2 then marks NN2 as active, and NN2 begins its transition (is in midst of it, not done yet at this point)
      2016-08-11 07:01:03,894 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at nn02/11.11.11.11:8022 active...
      2016-08-11 07:01:03,895 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: Call -> nn02/11.11.11.11:8022: transitionToActive {reqInfo { reqSource: REQUEST_BY_ZKFC }}
      …
      
      2016-08-11 07:01:09,558 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state
      …
      2016-08-11 07:01:19,968 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635
      
      • At the same time in parallel NN1 processes the transitionToActive requests finally, and becomes active
      2016-08-11 07:01:13,281 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state
      …
      2016-08-11 07:01:19,599 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635
      …
      2016-08-11 07:01:19,602 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 5635
      
      • NN2's active transition fails as a result of this parallel active transition on NN1 which has completed right before it tries to take over
      2016-08-11 07:01:19,968 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635
      2016-08-11 07:01:22,799 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
      java.lang.IllegalStateException: Cannot start writing at txid 5635 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@53feee4f
              at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:312)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1174)
              at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1654)
              at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
              at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
              at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1527)
              at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1302)
              at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
              at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
      

      (Reflects as a failure on the ZKFC2 too):

      2016-08-11 07:01:31,598 DEBUG org.apache.hadoop.ipc.Client: closing ipc connection to nn02/11.11.11.11:8022: null
      java.io.EOFException
              at java.io.DataInputStream.readInt(DataInputStream.java:392)
              at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)
              at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)
      2016-08-11 07:01:31,626 FATAL org.apache.hadoop.ha.ZKFailoverController: Couldn't make NameNode at nn02/11.11.11.11:8022 active
      java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "nn02/11.11.11.11"; destination host is: "nn02":8022;
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
              at org.apache.hadoop.ipc.Client.call(Client.java:1415)
              at org.apache.hadoop.ipc.Client.call(Client.java:1364)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
              at com.sun.proxy.$Proxy15.transitionToActive(Unknown Source)
              at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToActive(HAServiceProtocolClientSideTranslatorPB.java:100)
              at org.apache.hadoop.ha.HAServiceProtocolHelper.transitionToActive(HAServiceProtocolHelper.java:48)
              at org.apache.hadoop.ha.ZKFailoverController.becomeActive(ZKFailoverController.java:377)
              at org.apache.hadoop.ha.ZKFailoverController.access$900(ZKFailoverController.java:60)
              at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.becomeActive(ZKFailoverController.java:868)
              at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804)
              at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
              at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
              at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
      Caused by: java.io.EOFException
              at java.io.DataInputStream.readInt(DataInputStream.java:392)
              at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)
              at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)
      2016-08-11 07:01:31,665 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
      org.apache.hadoop.ha.ServiceFailedException: Couldn't transition to active
      

      We should perhaps avoid sending an active RPC if the NN is already active, by checking the service status before firing the transition RPC inside ZKFailoverController.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                qwertymaniac Harsh J
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated: