Hadoop Map/Reduce / MAPREDUCE-6745

Job directories should be cleaned from the staging directory /tmp/hadoop-yarn/staging after a MapReduce job finishes successfully


Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 2.7.2
    • Fix Version/s: None
    • Component/s: mr-am
    • Labels: None
    • Environment: Suse 11 sp3

    Description

      If a MapReduce client sets mapreduce.task.files.preserve.failedtasks=true, the temporary job directory is not deleted from the staging directory /tmp/hadoop-yarn/staging.
      As time goes by, the job files accumulate and eventually lead to the exception below:

      org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemExceededException):
      The directory item limit of /tmp/hadoop-yarn/staging/username/.staging is exceeded: limit=1048576 items=1048576
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:936)
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:981)
      at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:237)
      at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:191)
      at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createChildrenDirectories(FSDirMkdirOp.java:166)
      at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:97)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3788)
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:986)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:624)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolProtos.$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:624)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:973)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
      at java.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
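To spot this condition in logs before jobs start failing, the first line of the message can be parsed for the path, the limit, and the current item count. The following is only a minimal sketch using the Java standard library: the class name DirItemLimitMessage and its fields are hypothetical, and the pattern assumes the message wording shown in the trace above and a path without spaces.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Hadoop): parses the directory item limit message. */
final class DirItemLimitMessage {
    // Assumes the wording from the trace above; \S+ will not match paths containing spaces.
    private static final Pattern MESSAGE = Pattern.compile(
            "The directory item limit of (\\S+) is exceeded: limit=(\\d+) items=(\\d+)");

    final String path;
    final long limit;
    final long items;

    private DirItemLimitMessage(String path, long limit, long items) {
        this.path = path;
        this.limit = limit;
        this.items = items;
    }

    /** Returns the parsed message, or empty if the line does not contain it. */
    static Optional<DirItemLimitMessage> parse(String line) {
        Matcher m = MESSAGE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new DirItemLimitMessage(
                m.group(1), Long.parseLong(m.group(2)), Long.parseLong(m.group(3))));
    }

    public static void main(String[] args) {
        String line = "The directory item limit of /tmp/hadoop-yarn/staging/username/.staging"
                + " is exceeded: limit=1048576 items=1048576";
        parse(line).ifPresent(msg -> System.out.println(
                msg.path + " holds " + msg.items + " of " + msg.limit + " allowed entries"));
    }
}
```

A log scanner could call parse on each line and raise an alert for the reported path; lines that do not contain the message yield an empty result.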

      The official description of the mapreduce.task.files.preserve.failedtasks property reads:
      Should the files for failed tasks be kept. This should only be used on jobs that are failing, because the storage is never reclaimed.
      It also prevents the map outputs from being erased from the reduce directory as they are consumed.

      Based on that description, I think the temporary files of successfully completed tasks should not be kept.
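The behavior requested here can be written down as a small decision rule. This is only a sketch of the proposal, not the actual MapReduce ApplicationMaster code: the class StagingCleanupPolicy and the method shouldDeleteStagingDir are hypothetical names, and the rule assumes that the preserve flag should matter only when the job did not succeed.

```java
/** Hypothetical sketch of the proposed cleanup rule; not taken from the Hadoop source. */
final class StagingCleanupPolicy {

    /**
     * Decides whether the job's staging directory should be deleted.
     *
     * @param preserveFailedTasks value of mapreduce.task.files.preserve.failedtasks
     * @param jobSucceeded        whether the job finished successfully
     */
    static boolean shouldDeleteStagingDir(boolean preserveFailedTasks, boolean jobSucceeded) {
        if (jobSucceeded) {
            // Proposed change: a successful job leaves nothing to debug, so always clean up.
            return true;
        }
        // For unsuccessful jobs, keep the files only when the user asked to preserve them.
        return !preserveFailedTasks;
    }

    public static void main(String[] args) {
        System.out.println("succeeded, preserve=true  -> delete: "
                + shouldDeleteStagingDir(true, true));
        System.out.println("failed,    preserve=true  -> delete: "
                + shouldDeleteStagingDir(true, false));
        System.out.println("failed,    preserve=false -> delete: "
                + shouldDeleteStagingDir(false, false));
    }
}
```

Under this rule the staging directory only grows with jobs that failed while the flag was set, which matches the intent of the property description quoted above.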

          People

            Assignee: Unassigned
            Reporter: liuxiaoping