  1. Hadoop YARN
  2. YARN-6929

yarn.nodemanager.remote-app-log-dir structure is not scalable

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.3
    • Fix Version/s: 3.3.0
    • Component/s: log-aggregation
    • Labels:
      None

      Description

      The current directory structure for yarn.nodemanager.remote-app-log-dir is not scalable. HDFS limits the number of items in a single directory to 1048576 by default (HDFS-6102). With yarn.log-aggregation.retain-seconds set to 7 days, a busy cluster can accumulate that many application directories under one user, at which point LogAggregationService fails to create a new directory with FSLimitException$MaxDirectoryItemsExceededException.
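
      To make the limit concrete, the following is a rough back-of-the-envelope sketch, not code from the patch. It assumes the flat layout adds one directory per application under a single user's logs directory and that expired directories are deleted promptly. The class and method names are hypothetical.

      ```java
      // Hypothetical estimate; not part of YARN or the YARN-6929 patch.
      final class DirLimitEstimate {
          // Default per-directory item limit quoted in the issue (HDFS-6102).
          static final long MAX_DIR_ITEMS = 1_048_576L;
          static final long SECONDS_PER_DAY = 86_400L;

          // Applications per day (for one user) at which a flat directory
          // reaches the limit, assuming one subdirectory per application and
          // removal as soon as retainSeconds elapses.
          static long appsPerDayAtLimit(long retainSeconds) {
              return MAX_DIR_ITEMS * SECONDS_PER_DAY / retainSeconds;
          }

          public static void main(String[] args) {
              long sevenDays = 7 * SECONDS_PER_DAY;
              // prints 149796
              System.out.println(appsPerDayAtLimit(sevenDays));
          }
      }
      ```

      Under these assumptions, a single user submitting roughly 150,000 applications per day would fill the directory within the 7-day retention window.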

      The current structure is <yarn.nodemanager.remote-app-log-dir>/<user>/logs/<job_name>. This could be improved by adding the date as a subdirectory, for example:
      <yarn.nodemanager.remote-app-log-dir>/<user>/logs/<date>/<job_name>

      WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: Application failed to init aggregation 
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 items=1048576 
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021) 
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072) 
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841) 
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351) 
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262) 
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221) 
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194) 
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813) 
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600) 
      at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) 
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) 
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
      at java.security.AccessController.doPrivileged(Native Method) 
      at javax.security.auth.Subject.doAs(Subject.java:415) 
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308) 
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366) 
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320) 
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443) 
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67) 
      at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) 
      at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) 
      at java.lang.Thread.run(Thread.java:745) 
      Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 items=1048576 
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021) 
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072) 
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841) 
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351) 
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262) 
      

      Thanks to Robert Mancuso for finding this issue.

      ************************************************************************************
      Update:

      New App Log Dir Structure:

      {aggregation_log_root}/{user}/bucket_{suffix}/{bucket1}/{appId}
      
      where suffix is logs or logs-ifile
            bucket1 is application#getId % 10000
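
      The layout above can be sketched as follows. This is a minimal illustration, not the code from the attached patches: the class and method names are hypothetical, the appId string is assumed to follow the usual application_<clusterTimestamp>_<id> form, and the bucket name is left unpadded because the issue does not say whether it is zero-padded.

      ```java
      import java.util.Locale;

      // Hypothetical helper; not the implementation from the YARN-6929 patch.
      final class BucketedLogDir {
          // Modulus taken from the issue text: application#getId % 10000.
          static final int BUCKET_COUNT = 10000;

          // Builds {aggregation_log_root}/{user}/bucket_{suffix}/{bucket1}/{appId}.
          static String appLogDir(String root, String user, String suffix,
                                  long clusterTimestamp, int id) {
              int bucket = id % BUCKET_COUNT;
              // Assumed appId format: application_<clusterTimestamp>_<id, min 4 digits>.
              String appId = String.format(Locale.ROOT, "application_%d_%04d",
                      clusterTimestamp, id);
              return String.join("/", root, user, "bucket_" + suffix,
                      String.valueOf(bucket), appId);
          }

          public static void main(String[] args) {
              // prints /app-logs/yarn/bucket_logs/2345/application_1500000000000_12345
              System.out.println(
                      appLogDir("/app-logs", "yarn", "logs", 1500000000000L, 12345));
          }
      }
      ```

      Because the bucket is derived from the application id alone, a reader can recompute the directory for any appId without listing the parent. Each user directory then holds at most 10000 bucket directories, and the applications are spread across them.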
      

        Attachments

        1. YARN-6929.1.patch
          44 kB
          Prabhu Joseph
        2. YARN-6929.2.patch
          44 kB
          Prabhu Joseph
        3. YARN-6929.2.patch
          44 kB
          Prabhu Joseph
        4. YARN-6929.3.patch
          45 kB
          Prabhu Joseph
        5. YARN-6929.4.patch
          56 kB
          Prabhu Joseph
        6. YARN-6929.5.patch
          59 kB
          Prabhu Joseph
        7. YARN-6929.6.patch
          59 kB
          Prabhu Joseph
        8. YARN-6929.patch
          25 kB
          Prabhu Joseph
        9. YARN-6929-007.patch
          52 kB
          Prabhu Joseph
        10. YARN-6929-008.patch
          52 kB
          Prabhu Joseph
        11. YARN-6929-009.patch
          52 kB
          Prabhu Joseph
        12. YARN-6929-010.patch
          45 kB
          Prabhu Joseph
        13. YARN-6929-011.patch
          45 kB
          Prabhu Joseph
        14. YARN-6929-branch-3.1.001.patch
          45 kB
          Prabhu Joseph

              People

              • Assignee:
                prabhujoseph Prabhu Joseph
                Reporter:
                prabhujoseph Prabhu Joseph
              • Votes:
                0
                Watchers:
                11
