  1. Hadoop YARN
  2. YARN-9905

yarn-service fails to set up application log aggregation if app-log-dir is not on default-fs


Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved

    Description

      Currently, yarn-service obtains a delegation token for the default NameNode only.
      This can cause authentication failures under HDFS federation.

      How to reproduce:

      • Kerberized cluster
      • Multiple namespaces via HDFS federation
      • yarn.nodemanager.remote-app-log-dir is set to a namespace that is not default-fs
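
      The third condition can be checked mechanically. The following is a minimal sketch, not code from the patch: it uses only java.net.URI, and the class and method names (LogDirNamespaceCheck, isOnNonDefaultFs) are hypothetical. It assumes the two settings are passed in as plain strings such as hdfs://ns1 and hdfs://ns2/app-logs.

      ```java
      import java.net.URI;

      /** Sketch: does the remote app-log dir live outside the default filesystem? */
      final class LogDirNamespaceCheck {

          /**
           * Returns true when logDir names a scheme or authority that differs
           * from defaultFs. A logDir without a scheme (for example "/app-logs")
           * or without an authority is treated as resolving against defaultFs.
           * Both parameter names are illustrative, not Hadoop API.
           */
          static boolean isOnNonDefaultFs(String defaultFs, String logDir) {
              URI def = URI.create(defaultFs);
              URI dir = URI.create(logDir);
              if (dir.getScheme() == null) {
                  return false; // plain path, resolved against the default filesystem
              }
              if (!dir.getScheme().equalsIgnoreCase(def.getScheme())) {
                  return true; // different filesystem type altogether
              }
              if (dir.getAuthority() == null) {
                  return false; // same scheme, no explicit namespace given
              }
              // Same scheme: compare the namespace (authority) part.
              return !dir.getAuthority().equalsIgnoreCase(def.getAuthority());
          }
      }
      ```

      With fs.defaultFS set to hdfs://ns1, a log dir of hdfs://ns2/app-logs is reported as being on a non-default namespace, which is the situation this issue describes.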

      Here are the NodeManager logs from that time:

      2019-10-15 11:52:50,217 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(1122)) - Creating a new application reference for app application_1569373267731_9571
      2019-10-15 11:52:50,217 INFO  application.ApplicationImpl (ApplicationImpl.java:handle(655)) - Application application_1569373267731_9571 transitioned from NEW to INITING
      ...
      
       Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
              at sun.reflect.GeneratedConstructorAccessor45.newInstance(Unknown Source)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
              at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
              at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
              at org.apache.hadoop.ipc.Client.call(Client.java:1457)
              at org.apache.hadoop.ipc.Client.call(Client.java:1367)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
              at com.sun.proxy.$Proxy24.getFileInfo(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:900)
              at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
              at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
              at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
              at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
              at com.sun.proxy.$Proxy25.getFileInfo(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1660)
              at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
              at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1580)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1595)
              at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.checkExists(LogAggregationFileController.java:396)
              at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController$1.run(LogAggregationFileController.java:338)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
              at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.createAppDir(LogAggregationFileController.java:323)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:254)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:204)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:347)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69)
              at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
              at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
              at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:760)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
              at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:723)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:817)
              at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
              at org.apache.hadoop.ipc.Client.call(Client.java:1403)
              ... 32 more
      ...
      2019-10-15 11:55:26,184 INFO  application.ApplicationImpl (ApplicationImpl.java:transition(463)) - Adding container_e51_1569373267731_9571_01_000001 to application application_1569373267731_9571
      2019-10-15 11:55:26,184 WARN  application.ApplicationImpl (ApplicationImpl.java:transition(441)) - Log Aggregation service failed to initialize, there will be no logs for this application
      

      I think yarn-service should be able to obtain tokens for other NameNodes as well,
      similar to tez.job.fs-servers in Tez or mapreduce.job.hdfs-servers in MapReduce.
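
      To illustrate the idea, here is a minimal sketch of how a comma-separated list of extra filesystems, in the style of mapreduce.job.hdfs-servers, could be turned into the set of filesystems that need a token. It is not taken from the attached patches: the class and method names (TokenTargets, collect) are hypothetical, and it uses only the Java standard library. In a real implementation each resulting URI would presumably be handed to Hadoop's token machinery (for example FileSystem.addDelegationTokens); that step is omitted here.

      ```java
      import java.net.URI;
      import java.util.ArrayList;
      import java.util.LinkedHashSet;
      import java.util.List;
      import java.util.Set;

      /** Sketch: which filesystems would need a delegation token? */
      final class TokenTargets {

          /**
           * Builds the de-duplicated list of filesystem roots (scheme://authority)
           * from the default filesystem plus a comma-separated list of extra
           * filesystems. The default filesystem always comes first.
           */
          static List<String> collect(String defaultFs, String extraServers) {
              Set<String> roots = new LinkedHashSet<>(); // keeps insertion order
              roots.add(rootOf(defaultFs));
              if (extraServers != null) {
                  for (String entry : extraServers.split(",")) {
                      String trimmed = entry.trim();
                      if (!trimmed.isEmpty()) {
                          roots.add(rootOf(trimmed));
                      }
                  }
              }
              return new ArrayList<>(roots);
          }

          /** Reduces a fully qualified URI to scheme://authority. */
          private static String rootOf(String value) {
              URI uri = URI.create(value);
              if (uri.getScheme() == null || uri.getAuthority() == null) {
                  throw new IllegalArgumentException(
                      "Expected a fully qualified filesystem URI: " + value);
              }
              return uri.getScheme() + "://" + uri.getAuthority();
          }
      }
      ```

      For example, collect("hdfs://ns1", "hdfs://ns2/app-logs, hdfs://ns1/") yields hdfs://ns1 and hdfs://ns2, so the namespace holding the remote app-log dir would be included when tokens are requested.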

      Attachments

        1. YARN-9905.002.patch
          3 kB
          kyungwan nam
        2. YARN-9905.001.patch
          3 kB
          kyungwan nam

            People

              kyungwan nam