  1. Hadoop YARN
  2. YARN-2492 (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests
  3. YARN-3930

FileSystemNodeLabelsStore should make sure edit log file closed when exception is thrown

Details

    • Reviewed

    Description

      While testing the node label feature in my local environment, I encountered the following exception:

              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2426)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2222)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2523)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2498)
              at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:662)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:418)
              at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168)
      
              at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.handleStoreEvent(CommonNodeLabelsManager.java:196)
              at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:168)
              at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:163)
              at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:176)
              at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
              at java.lang.Thread.run(Thread.java:745)
      

      The cause is that HDFS threw an exception during a call to ensureAppendEditlogFile, which left the edit log output stream unclosed. As a result, the next call to ensureAppendEditlogFile fails in lease recovery, because this client is still the lease holder of the edit log file.
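
      One way to guarantee that the stream is closed on failure is a try/finally around the append-and-write sequence. The following is a minimal sketch of that pattern only, not the actual FileSystemNodeLabelsStore code or the attached patch. TrackedStream, EditLogStoreSketch, closeEditlogFile, writeRecord and the failWrite flag are hypothetical names introduced for illustration; only ensureAppendEditlogFile comes from the description above, and here it opens an in-memory stand-in rather than an HDFS stream.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for the edit log output stream; records whether close() ran. */
class TrackedStream implements Closeable {
    boolean closed = false;

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical sketch of a store that appends to an edit log. */
class EditLogStoreSketch {
    TrackedStream editlogOs;   // null when no stream is open
    TrackedStream lastOpened;  // kept only so the demo can inspect the stream afterwards

    /** Opens the edit log for append (simulated; the real method would call into HDFS). */
    void ensureAppendEditlogFile() {
        editlogOs = new TrackedStream();
        lastOpened = editlogOs;
    }

    /** Closes the stream if one is open, releasing the (simulated) lease. */
    void closeEditlogFile() {
        if (editlogOs != null) {
            editlogOs.close();
            editlogOs = null;
        }
    }

    /** Writes one record; the finally block closes the stream even when the write throws. */
    void writeRecord(boolean failWrite) throws IOException {
        try {
            ensureAppendEditlogFile();
            if (failWrite) {
                throw new IOException("simulated write failure");
            }
        } finally {
            closeEditlogFile();
        }
    }

    public static void main(String[] args) {
        EditLogStoreSketch store = new EditLogStoreSketch();
        try {
            store.writeRecord(true);
        } catch (IOException e) {
            // The write failed, but the stream must already be closed at this point.
            System.out.println("closed after failure: " + store.lastOpened.closed);
        }
    }
}
```

      With this shape, a failed write leaves no open stream behind, so a later append does not have to recover a lease that the same client still holds.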

      Attachments

        1. YARN-3930.001.patch
          3 kB
          Dian Fu

          People

            dian.fu Dian Fu
