Hadoop Map/Reduce
MAPREDUCE-6410

Aggregated Logs Deletion doesn't work after refreshing Log Retention Settings in a secure cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.1, 3.0.0-alpha1
    • Component/s: None
    • Labels:
    • Environment:

      mrV2, secure mode

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      A GSSException is thrown every time log aggregation deletion is attempted after executing bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster.

      The problem can be reproduced with the following steps:
      1. Start the history server in a secure cluster.
      2. Log deletion happens as expected.
      3. Execute the mapred hsadmin -refreshLogRetentionSettings command to refresh the configuration value.
      4. All subsequent log deletion attempts fail with a GSSException.

      The following exception can be found in the history server's log if log deletion is enabled.

      2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127
      java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination host is: "vm-33":25000; 
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
              at org.apache.hadoop.ipc.Client.call(Client.java:1414)
              at org.apache.hadoop.ipc.Client.call(Client.java:1363)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
              at com.sun.proxy.$Proxy9.getListing(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
              at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
              at com.sun.proxy.$Proxy10.getListing(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
              at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
              at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
              at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
              at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
              at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
              at java.util.TimerThread.mainLoop(Timer.java:555)
              at java.util.TimerThread.run(Timer.java:505)
      Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
              at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
              at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
              at org.apache.hadoop.ipc.Client.call(Client.java:1381)
              ... 21 more
      Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
              at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411)
              at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:550)
              at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:716)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:712)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:711)
              ... 24 more
      Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
              at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
              at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
              at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
              at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
              ... 33 more
      
      1. YARN-3779.03.patch
        3 kB
        Varun Saxena
      2. YARN-3779.02.patch
        5 kB
        Varun Saxena
      3. YARN-3779.01.patch
        5 kB
        Varun Saxena
      4. MAPREDUCE-6410.05.patch
        7 kB
        Vinod Kumar Vavilapalli
      5. MAPREDUCE-6410.04.patch
        7 kB
        Varun Saxena
      6. log_aggr_deletion_on_refresh_fix.log
        294 kB
        Varun Saxena
      7. log_aggr_deletion_on_refresh_error.log
        375 kB
        Varun Saxena

        Activity

        varun_saxena Varun Saxena added a comment -

        Able to reproduce the issue. Deletion always fails (due to a GSSException) after refreshing the log retention settings.
        Will look into it.

        varun_saxena Varun Saxena added a comment -

        Moved it to YARN as the code changes will be in yarn-common.

        varun_saxena Varun Saxena added a comment -

        Instead of adopting the approach followed in the patch, the issue can also be fixed by using a ScheduledThreadPoolExecutor instead of a Timer (see the sketch below).
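        For reference, a minimal sketch of that alternative, with hypothetical class and method names (this is not the actual patch): the executor's single worker thread is created once and then reused, so a refresh merely cancels and re-submits the task instead of spawning a new thread.

        import java.util.concurrent.Executors;
        import java.util.concurrent.ScheduledExecutorService;
        import java.util.concurrent.ScheduledFuture;
        import java.util.concurrent.TimeUnit;

        // Hypothetical sketch: reschedule log deletion without creating new threads.
        public class LogDeletionScheduler {
          // Single worker thread, created on first use and then reused for all runs.
          private final ScheduledExecutorService scheduler =
              Executors.newScheduledThreadPool(1);
          private ScheduledFuture<?> current;

          // Called at service start and again on -refreshLogRetentionSettings.
          public synchronized void schedule(Runnable deletionTask, long intervalMs) {
            if (current != null) {
              current.cancel(false); // don't interrupt an in-flight deletion run
            }
            current = scheduler.scheduleAtFixedRate(
                deletionTask, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
          }

          public synchronized void stop() {
            scheduler.shutdownNow();
          }
        }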

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 15m 58s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 javac 7m 33s There were no new javac warning messages.
        +1 javadoc 9m 36s There were no new javadoc warning messages.
        +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 49s There were no new checkstyle issues.
        -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
        +1 install 1m 34s mvn install still works.
        +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
        +1 findbugs 1m 33s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 yarn tests 1m 56s Tests passed in hadoop-yarn-common.
            39m 58s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12738240/YARN-3779.01.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / b61b489
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/8211/artifact/patchprocess/whitespace.txt
        hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8211/artifact/patchprocess/testrun_hadoop-yarn-common.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8211/testReport/
        Java 1.7.0_55
        uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/8211/console

        This message was automatically generated.

        varun_saxena Varun Saxena added a comment -

        Fixed whitespace issue

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 16m 7s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 javac 7m 36s There were no new javac warning messages.
        +1 javadoc 9m 35s There were no new javadoc warning messages.
        +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 56s There were no new checkstyle issues.
        +1 whitespace 0m 0s The patch has no lines that end in whitespace.
        +1 install 1m 34s mvn install still works.
        +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
        +1 findbugs 1m 35s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 yarn tests 1m 57s Tests passed in hadoop-yarn-common.
            40m 19s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12738254/YARN-3779.02.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / b61b489
        hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8212/artifact/patchprocess/testrun_hadoop-yarn-common.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8212/testReport/
        Java 1.7.0_55
        uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/8212/console

        This message was automatically generated.

        zjshen Zhijie Shen added a comment -

        So the problem is that after refreshing, the deletion task is scheduled and executed under the UGI of whoever executes the refresh command, right?

        varun_saxena Varun Saxena added a comment -

        Zhijie Shen, thanks for looking at this.
        It's the same user that is used both for starting the history server and for executing the refresh command.
        Timer creates a new thread on refresh, and from then on the problem occurs.

        There is no problem if I use a ScheduledThreadPoolExecutor (with 1 thread) instead, as that doesn't spawn a new thread.
        So it seems the new thread doesn't pick up the correct UGI.

        Are you able to reproduce the issue?
        I hope there is no issue in the way Kerberos has been set up in my cluster.
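
        A small standalone illustration of the suspected mechanism (a hypothetical demo, not Hadoop code): a thread created while a Subject is in effect inherits that AccessControlContext, so a Timer recreated inside the refresh RPC handler would keep the handler's Subject rather than the one the history server was started with.

        import java.security.AccessController;
        import java.security.PrivilegedAction;
        import java.util.Timer;
        import java.util.TimerTask;
        import javax.security.auth.Subject;

        // Hypothetical demo of Subject inheritance by child threads.
        public class InheritedContextDemo {
          public static void main(String[] args) {
            // Stands in for the Subject in effect during the refresh RPC call.
            Subject rpcCaller = new Subject();
            Subject.doAs(rpcCaller, new PrivilegedAction<Void>() {
              @Override
              public Void run() {
                // The Timer's worker thread is created here, inside the doAs,
                // and captures the current AccessControlContext.
                final Timer timer = new Timer("demo-timer");
                timer.schedule(new TimerTask() {
                  @Override
                  public void run() {
                    // Should print rpcCaller's (empty) Subject, not the
                    // context in effect when the enclosing service started.
                    System.out.println("Subject in timer thread: "
                        + Subject.getSubject(AccessController.getContext()));
                    timer.cancel(); // let the demo JVM exit
                  }
                }, 0);
                return null;
              }
            });
          }
        }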

        zjshen Zhijie Shen added a comment -

        No, I didn't reproduce the problem; I just had a quick glance at the code. A log retention refresh will reschedule the deletion task, but this is done in the RPC call by the requesting user. So I'm now wondering whether this changes the UGI of the subsequent deletion tasks. Can you try to print the UGI? Then we can see what has changed.
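
        A probe along these lines would show whether the rescheduled task runs under a different UGI (a minimal sketch using UserGroupInformation's public API; the suggested call site inside LogDeletionTask.run is an assumption):

        import java.io.IOException;
        import org.apache.hadoop.security.UserGroupInformation;

        // Hypothetical debugging helper: call dump("LogDeletionTask.run") at the
        // start of the deletion task, both before and after the refresh.
        public final class UgiProbe {
          private UgiProbe() {
          }

          public static void dump(String where) {
            try {
              UserGroupInformation current = UserGroupInformation.getCurrentUser();
              UserGroupInformation login = UserGroupInformation.getLoginUser();
              System.out.println(where
                  + ": currentUser=" + current
                  + " (auth:" + current.getAuthenticationMethod() + ")"
                  + ", loginUser=" + login
                  + ", hasKerberosCredentials=" + current.hasKerberosCredentials());
            } catch (IOException e) {
              System.out.println(where + ": could not determine UGI: " + e);
            }
          }
        }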

        varun_saxena Varun Saxena added a comment -

        Zhijie Shen, the GSSException was thrown while calling evaluateChallenge in SaslRpcClient.java.
        I printed the DEBUG logs when I tested this (at the history server side). It seems the correct UGI is taken, but the error still occurs.

        Below are the logs from when the error occurs after a refresh of the log retention settings.

        2015-06-05 22:49:24,541 INFO IPC Server handler 0 on 10033  org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: USER=hdfs	IP=10.19.92.82	OPERATION=refreshLogRetentionSettings	TARGET=HSAdminServer	RESULT=SUCCESS
        ...
        2015-06-05 22:50:04,541 INFO Timer-3  org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.use.legacy.blockreader.local = false
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.read.shortcircuit = false
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.domain.socket.data.traffic = false
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.domain.socket.path = 
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.DFSClient: Sets dfs.client.block.write.replace-datanode-on-failure.replication to 0
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://hacluster
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.use.legacy.blockreader.local = false
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.read.shortcircuit = false
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.domain.socket.data.traffic = false
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.domain.socket.path = 
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.io.retry.RetryUtils: multipleLinearRandomRetry = null
        2015-06-05 22:49:24,553 DEBUG Timer-3  org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@28194a50
        2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil: DataTransferProtocol using SaslPropertiesResolver, configured QOP dfs.data.transfer.protection = authentication, configured class dfs.data.transfer.saslproperties.resolver.class = class org.apache.hadoop.security.SaslPropertiesResolver
        2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.ipc.Client: The ping interval is 60000 ms.
        2015-06-05 22:50:04,542 DEBUG Timer-3  org.apache.hadoop.ipc.Client: Connecting to host-10-19-92-88/10.19.92.88:65110
        2015-06-05 22:50:04,543 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:749)
        2015-06-05 22:50:04,544 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal=, serverPrincipal=dfs.namenode.kerberos.principal)
        2015-06-05 22:50:04,545 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: getting serverKey: dfs.namenode.kerberos.principal conf value: hdfs/huawei@HADOOP.COM principal: hdfs/huawei@HADOOP.COM
        2015-06-05 22:50:04,545 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: RPC Server's Kerberos principal name for protocol=org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB is hdfs/huawei@HADOOP.COM
        2015-06-05 22:50:04,545 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Creating SASL GSSAPI(KERBEROS)  client to authenticate to service at huawei
        2015-06-05 22:50:04,546 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Use KERBEROS authentication for protocol ClientNamenodeProtocolPB
        2015-06-05 22:50:04,547 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) 
        

        And these are the logs from before the error, i.e. when everything was working fine.

        2015-06-05 22:49:16,989 INFO Timer-2  org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
        
        2015-06-05 22:49:17,055 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.use.legacy.blockreader.local = false
        2015-06-05 22:49:17,055 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.read.shortcircuit = false
        2015-06-05 22:49:17,055 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.domain.socket.data.traffic = false
        2015-06-05 22:49:17,055 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.domain.socket.path = 
        2015-06-05 22:49:17,056 DEBUG Timer-2  org.apache.hadoop.hdfs.DFSClient: Sets dfs.client.block.write.replace-datanode-on-failure.replication to 0
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://hacluster
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.use.legacy.blockreader.local = false
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.read.shortcircuit = false
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.domain.socket.data.traffic = false
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.domain.socket.path = 
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.io.retry.RetryUtils: multipleLinearRandomRetry = null
        2015-06-05 22:49:17,057 DEBUG Timer-2  org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@28194a50
        2015-06-05 22:49:17,059 DEBUG Timer-2  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil: DataTransferProtocol using SaslPropertiesResolver, configured QOP dfs.data.
        
        2015-06-05 22:49:17,061 DEBUG IPC Parameter Sending Thread #0  org.apache.hadoop.ipc.Client: IPC Client (1125964210) connection to /10.19.92.88:65110 from hdfs/huawei@HADOOP.COM sending #3
        2015-06-05 22:49:17,062 DEBUG IPC Client (1125964210) connection to /10.19.92.88:65110 from hdfs/huawei@HADOOP.COM  org.apache.hadoop.ipc.Client: IPC Client (1125964210) connection to /10.19.92.88:65110 from hdfs/huawei@HADOOP.COM got value #3
        2015-06-05 22:49:17,063 DEBUG Timer-2  org.apache.hadoop.ipc.ProtobufRpcEngine: Call: getListing took 2ms
        2015-06-05 22:49:17,065 INFO Timer-2  org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion finished.
        
        varun_saxena Varun Saxena added a comment -

        Sorry, the correct sequence of error logs is as follows. After the first GSSException, the client (i.e. the history server) keeps retrying before giving up.

        2015-06-05 22:49:24,541 INFO Timer-3  org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
        2015-06-05 22:49:24,541 INFO IPC Server handler 0 on 10033  org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: USER=hdfs	IP=10.19.92.82	OPERATION=refreshLogRetentionSettings	TARGET=HSAdminServer	RESULT=SUCCESS
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.use.legacy.blockreader.local = false
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.read.shortcircuit = false
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.domain.socket.data.traffic = false
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.domain.socket.path = 
        2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.DFSClient: Sets dfs.client.block.write.replace-datanode-on-failure.replication to 0
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://hacluster
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.use.legacy.blockreader.local = false
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.read.shortcircuit = false
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.client.domain.socket.data.traffic = false
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: dfs.domain.socket.path = 
        2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.io.retry.RetryUtils: multipleLinearRandomRetry = null
        2015-06-05 22:49:24,553 DEBUG Timer-3  org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@28194a50
        2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil: DataTransferProtocol using SaslPropertiesResolver, configured QOP dfs.data.transfer.protection = authentication, configured class dfs.data.transfer.saslproperties.resolver.class = class org.apache.hadoop.security.SaslPropertiesResolver
        2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.ipc.Client: The ping interval is 60000 ms.
        2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.ipc.Client: Connecting to /10.19.92.88:65110
        2015-06-05 22:49:24,555 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:749)
        2015-06-05 22:49:24,557 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal=, serverPrincipal=dfs.namenode.kerberos.principal)
        2015-06-05 22:49:24,557 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: getting serverKey: dfs.namenode.kerberos.principal conf value: hdfs/huawei@HADOOP.COM principal: hdfs/huawei@HADOOP.COM
        2015-06-05 22:49:24,557 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: RPC Server's Kerberos principal name for protocol=org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB is hdfs/huawei@HADOOP.COM
        2015-06-05 22:49:24,557 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Creating SASL GSSAPI(KERBEROS)  client to authenticate to service at huawei
        2015-06-05 22:49:24,558 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Use KERBEROS authentication for protocol ClientNamenodeProtocolPB
        2015-06-05 22:49:24,559 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        2015-06-05 22:49:24,560 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:668)
        2015-06-05 22:49:24,561 WARN Timer-3  org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        2015-06-05 22:49:24,562 DEBUG Timer-3  org.apache.hadoop.ipc.Client: closing ipc connection to /10.19.92.88:65110: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:709)
        	at java.security.AccessController.doPrivileged(Native Method)
        .......
        
        2015-06-05 22:49:24,562 DEBUG Timer-3  org.apache.hadoop.ipc.Client: IPC Client (1125964210) connection to /10.19.92.88:65110 from hdfs/huawei@HADOOP.COM: closed
        2015-06-05 22:49:24,567 INFO Timer-3  org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB over /10.19.92.88:65110. Trying to fail over immediately.
        java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "HOST-10-19-92-82/10.19.92.82"; destination host is: "host-10-19-92-88":65110; 
        	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
        	at org.apache.hadoop.ipc.Client.call(Client.java:1516)
        	at org.apache.hadoop.ipc.Client.call(Client.java:1443)
        ...........
        2015-06-05 22:49:24,568 DEBUG Timer-3  org.apache.hadoop.io.retry.RetryUtils: multipleLinearRandomRetry = null
        2015-06-05 22:49:24,569 DEBUG Timer-3  org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@28194a50
        2015-06-05 22:49:24,569 DEBUG Timer-3  org.apache.hadoop.ipc.Client: The ping interval is 60000 ms.
        2015-06-05 22:49:24,569 DEBUG Timer-3  org.apache.hadoop.ipc.Client: Connecting to /10.19.92.95:65110
        2015-06-05 22:49:24,574 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:749)
        2015-06-05 22:49:24,577 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal=, serverPrincipal=dfs.namenode.kerberos.principal)
        2015-06-05 22:49:24,577 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: getting serverKey: dfs.namenode.kerberos.principal conf value: hdfs/huawei@HADOOP.COM principal: hdfs/huawei@HADOOP.COM
        2015-06-05 22:49:24,578 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: RPC Server's Kerberos principal name for protocol=org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB is hdfs/huawei@HADOOP.COM
        2015-06-05 22:49:24,578 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Creating SASL GSSAPI(KERBEROS)  client to authenticate to service at huawei
        2015-06-05 22:49:24,579 DEBUG Timer-3  org.apache.hadoop.security.SaslRpcClient: Use KERBEROS authentication for protocol ClientNamenodeProtocolPB
        2015-06-05 22:49:24,580 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        2015-06-05 22:49:24,585 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:668)
        2015-06-05 22:49:24,585 WARN Timer-3  org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        2015-06-05 22:49:24,585 DEBUG Timer-3  org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        .....
        ....(several similar logs)
        
        2015-06-05 22:49:24,699 ERROR Timer-3  org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: Error reading root log dir this deletion attempt is being aborted
        java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "HOST-10-19-92-82/10.19.92.82"; destination host is: "host-10-19-92-95":65110; 
        	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
        	at org.apache.hadoop.ipc.Client.call(Client.java:1516)
        	at org.apache.hadoop.ipc.Client.call(Client.java:1443)
        	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        ....
        
        2015-06-05 22:49:24,699 INFO Timer-3  org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion finished.
        
        xgong Xuan Gong added a comment -

        Varun Saxena Thanks for the logs. Could you apply the patch and print the UGI?

        varun_saxena Varun Saxena added a comment -

        Sure. Will share DEBUG logs for that too.

        varun_saxena Varun Saxena added a comment -

        Xuan Gong, after applying the patch, the debug log on refreshing the log retention settings is as follows. I will attach both the success and error logs a little while later.

        2015-06-11 14:49:56,973 DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 for port 10033: responding to null from 10.19.92.82:30295 Call#-33 Retry#-1 Wrote 22 bytes.
        2015-06-11 14:49:56,981 DEBUG org.apache.hadoop.ipc.Server:  got #-3
        2015-06-11 14:49:57,014 DEBUG org.apache.hadoop.ipc.Server: Successfully authorized userInfo {
          effectiveUser: "hdfs/huawei@HADOOP.COM"
        }
        protocol: "org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol"
        
        2015-06-11 14:49:57,014 DEBUG org.apache.hadoop.ipc.Server:  got #0
        2015-06-11 14:49:57,015 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10033: org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings from 10.19.92.82:30295 Call#0 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
        2015-06-11 14:49:57,016 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
        2015-06-11 14:49:57,027 INFO org.apache.hadoop.mapreduce.v2.hs.server.HSAdminServer: HS Admin: refreshLogRetentionSettings invoked by user hdfs
        2015-06-11 14:49:57,027 DEBUG org.apache.hadoop.ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@2dfaea86
        2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
        2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.yarn.ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
        2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ApplicationClientProtocol
        2015-06-11 14:49:57,080 DEBUG org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@2dfaea86
        2015-06-11 14:49:57,081 INFO org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
        2015-06-11 14:49:57,081 INFO org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: USER=hdfs IP=10.19.92.82  OPERATION=refreshLogRetentionSettings   TARGET=HSAdminServer    RESULT=SUCCESS
        2015-06-11 14:49:57,081 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/huawei@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:83)
        2015-06-11 14:49:57,081 DEBUG org.apache.hadoop.ipc.Server: Served: refreshLogRetentionSettings queueTime= 11 procesingTime= 55
        2015-06-11 14:49:57,082 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10033: responding to org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings from 10.19.92.82:30295 Call#0 Retry#0
        2015-06-11 14:49:57,083 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10033: responding to org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings from 10.19.92.82:30295 Call#0 Retry#0 Wrote 32 bytes.
        2015-06-11 14:49:57,083 DEBUG org.apache.hadoop.ipc.Client: IPC Client (889891977) connection to /10.19.92.82:65110 from hdfs/huawei@HADOOP.COM sending #5
        2015-06-11 14:49:57,084 DEBUG org.apache.hadoop.ipc.Client: IPC Client (889891977) connection to /10.19.92.82:65110 from hdfs/huawei@HADOOP.COM got value #5
        2015-06-11 14:49:57,084 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: getListing took 1ms
        2015-06-11 14:49:57,085 INFO org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion finished.
        
        varun_saxena Varun Saxena added a comment -

        Xuan Gong, also updated complete logs: one demonstrating the problem and the other demonstrating the fix (after the patch above has been applied). Moreover, this issue can also be fixed by using a ScheduledThreadPoolExecutor with one thread (which is anyway recommended over Timer), but as that fix wasn't directly related to the issue, I didn't submit it as the solution.

        varun_saxena Varun Saxena added a comment -

        By updated I mean attached.

        zjshen Zhijie Shen added a comment -

        Varun Saxena, do you know why the UGI is still the same, but Kerberos authentication fails?

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Went through the ticket.

        Figured out why you needed to do this: the incoming UGI from the RPC layer doesn't have the Kerberos credentials. So even though the user-name is the same, the remote UGI cannot talk to HDFS over Kerberos authentication.

        Comments on the patch

        • Let's call the UGI the login-UGI and use the UserGroupInformation.getLoginUser() call. In fact, you should simply copy the usage of ResourceManager.rmLoginUGI: point it to UserGroupInformation.getCurrentUser() if security is not enabled, otherwise point it to UserGroupInformation.getLoginUser(). See the sketch after this list.
        • Also, we usually do the login in serviceStart() in other services. So you may want to move the initialization of the UGI to the beginning of serviceStart().
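        Something like this, following the rmLoginUGI pattern (field name and placement here are a sketch, not the final patch):

        private UserGroupInformation loginUGI;

        @Override
        protected void serviceStart() throws Exception {
          if (UserGroupInformation.isSecurityEnabled()) {
            // Secure cluster: use the daemon's Kerberos login identity.
            loginUGI = UserGroupInformation.getLoginUser();
          } else {
            // Security disabled: the current user suffices.
            loginUGI = UserGroupInformation.getCurrentUser();
          }
          super.serviceStart();
        }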

        I think the same bug may happen in refreshJobRetentionSettings() of the JobHistory server. Varun Saxena, can you please give it a try and verify that? We may have to file an MR ticket for this.

        IAC, I think these issues have been around for a while. If we can get these fixes in and verified this week, I'll include them in 2.7.1; otherwise, we can move them to 2.7.2.

        zjshen Zhijie Shen added a comment -

        Thanks for helping with the issue, Vinod! That sounds like the right cause of this issue. I checked refreshJobRetentionSettings, which should have the same problem because it accesses HDFS too.

        I think it is clearer to fix the problem inside HSAdminServer. We still need to cache the correct loginUGI. Then, inside HSAdminServer, once we have verified the user's permission for a certain command, we use the loginUGI instead of the remote user to complete the rest of the processing, as sketched below. Thoughts?
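        A rough sketch of that approach (checkAcls, loginUGI and jobHistoryService are assumed members of HSAdminServer here; this is illustrative, not the final patch):

        import java.io.IOException;
        import java.security.PrivilegedExceptionAction;
        import org.apache.hadoop.security.UserGroupInformation;

        public void refreshLogRetentionSettings() throws IOException {
          // Verify the remote caller's permission first, as before.
          checkAcls("refreshLogRetentionSettings");
          try {
            // Then do the actual refresh as the daemon's login user, which
            // holds the Kerberos credentials needed to talk to HDFS.
            loginUGI.doAs(new PrivilegedExceptionAction<Void>() {
              @Override
              public Void run() throws IOException {
                jobHistoryService.refreshLogRetentionSettings();
                return null;
              }
            });
          } catch (InterruptedException e) {
            throw new IOException(e);
          }
        }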

        varun_saxena Varun Saxena added a comment -

        Vinod Kumar Vavilapalli, Zhijie Shen,
        I had checked refreshJobRetentionSettings too when this issue came up, and the issue didn't happen there.
        This issue occurs in the case of refreshLogRetentionSettings because a new thread is spawned (upon cancellation of the Timer), which creates a new DFS client to connect to the NameNode.

        In the case of refreshing job retention settings, we use a ScheduledThreadPoolExecutor instead, hence a new thread is not spawned on refresh; we simply cancel the ScheduledFuture. In that case, the issue doesn't happen. See the sketch below.
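        A heavily simplified sketch of the difference between the two refresh paths (class, field, and method names here are illustrative only):

        import java.util.Timer;
        import java.util.TimerTask;
        import java.util.concurrent.ScheduledFuture;
        import java.util.concurrent.ScheduledThreadPoolExecutor;
        import java.util.concurrent.TimeUnit;

        class RefreshSketch {
          private Timer timer;
          private final ScheduledThreadPoolExecutor executor =
              new ScheduledThreadPoolExecutor(1);
          private ScheduledFuture<?> future;

          // Log retention refresh: cancelling the Timer kills its thread, and
          // the replacement Timer spawns a brand new one. The new deletion
          // task then creates a fresh DFS client under whatever UGI is
          // current, which after an RPC-triggered refresh lacks Kerberos
          // credentials.
          void refreshLogRetention(TimerTask deletionTask, long intervalMs) {
            if (timer != null) {
              timer.cancel();
            }
            timer = new Timer();
            timer.scheduleAtFixedRate(deletionTask, 0, intervalMs);
          }

          // Job retention refresh: only the pending task is cancelled; the
          // executor and its single thread are reused, so no new DFS client
          // is created and the issue does not show up.
          void refreshJobRetention(Runnable cleanerTask, long delayMs) {
            if (future != null) {
              future.cancel(false);
            }
            future = executor.schedule(cleanerTask, delayMs,
                TimeUnit.MILLISECONDS);
          }
        }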

        varun_saxena Varun Saxena added a comment -

        Will update the patch as per suggestions tomorrow morning.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Varun Saxena, I agree with Zhijie here. We may just be lucky for now in the case of the refreshJobRetention call, depending on how we spawn threads. To future-proof ourselves, I think the right behaviour is to simply depend on the loginUser in both cases.

        varun_saxena Varun Saxena added a comment -

        Vinod Kumar Vavilapalli, that's correct.
        So do you want me to raise another JIRA for that, or do it as part of this one only?

        varun_saxena Varun Saxena added a comment -

        Added a patch and submitted it, fixing both cases. This JIRA should move to MAPREDUCE, but I am not moving it because I am not sure Jenkins will be able to post results for the submitted patch then.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 15m 56s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 javac 7m 46s There were no new javac warning messages.
        +1 javadoc 9m 52s There were no new javadoc warning messages.
        +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 28s There were no new checkstyle issues.
        +1 whitespace 0m 0s The patch has no lines that end in whitespace.
        +1 install 1m 35s mvn install still works.
        +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
        +1 findbugs 0m 55s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 mapreduce tests 5m 53s Tests passed in hadoop-mapreduce-client-hs.
            43m 25s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12740836/YARN-3779.03.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 055cd5a
        hadoop-mapreduce-client-hs test log https://builds.apache.org/job/PreCommit-YARN-Build/8301/artifact/patchprocess/testrun_hadoop-mapreduce-client-hs.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8301/testReport/
        Java 1.7.0_55
        uname Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/8301/console

        This message was automatically generated.

        varun_saxena Varun Saxena added a comment -

        Moved this to MAPREDUCE as the code change is in MAPREDUCE code.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 19m 50s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 javac 10m 28s There were no new javac warning messages.
        +1 javadoc 11m 23s There were no new javadoc warning messages.
        +1 release audit 0m 27s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 33s There were no new checkstyle issues.
        +1 whitespace 0m 0s The patch has no lines that end in whitespace.
        +1 install 1m 46s mvn install still works.
        +1 eclipse:eclipse 0m 41s The patch built with eclipse:eclipse.
        +1 findbugs 1m 9s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 mapreduce tests 6m 26s Tests passed in hadoop-mapreduce-client-hs.
            52m 47s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12740836/YARN-3779.03.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / c7d022b
        hadoop-mapreduce-client-hs test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5820/artifact/patchprocess/testrun_hadoop-mapreduce-client-hs.txt
        Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5820/testReport/
        Java 1.7.0_55
        uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5820/console

        This message was automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Varun Saxena, can you add a test in TestHSAdminServer? You can do a refresh as a different user and validate that the daemon-user does the real refresh. Something along the lines sketched below.
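        A rough shape for such a test (hsAdminClient and the UGI-capturing stub are assumed test fixtures, not the final test):

        import static org.junit.Assert.assertEquals;
        import java.security.PrivilegedExceptionAction;
        import java.util.concurrent.atomic.AtomicReference;
        import org.apache.hadoop.security.UserGroupInformation;

        // Capture the UGI that actually performs the refresh.
        final AtomicReference<UserGroupInformation> refreshUser =
            new AtomicReference<UserGroupInformation>();
        // Issue the admin command as a user other than the daemon user.
        UserGroupInformation remoteUser =
            UserGroupInformation.createRemoteUser("someAdmin");
        remoteUser.doAs(new PrivilegedExceptionAction<Void>() {
          @Override
          public Void run() throws Exception {
            // The stubbed deletion service is assumed to record
            // UserGroupInformation.getCurrentUser() into refreshUser.
            hsAdminClient.run(new String[] {"-refreshLogRetentionSettings"});
            return null;
          }
        });
        // The real work must have run as the daemon's login user.
        assertEquals(UserGroupInformation.getLoginUser().getUserName(),
            refreshUser.get().getUserName());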

        varun_saxena Varun Saxena added a comment -

        Vinod Kumar Vavilapalli, sure. Will do so.

        varun_saxena Varun Saxena added a comment -

        Vinod Kumar Vavilapalli, added test.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 15m 28s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        +1 javac 7m 54s There were no new javac warning messages.
        +1 javadoc 9m 59s There were no new javadoc warning messages.
        +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 30s There were no new checkstyle issues.
        -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
        +1 install 1m 35s mvn install still works.
        +1 eclipse:eclipse 0m 32s The patch built with eclipse:eclipse.
        +1 findbugs 0m 53s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 mapreduce tests 6m 4s Tests passed in hadoop-mapreduce-client-hs.
            43m 20s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12741120/MAPREDUCE-6410.04.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 445b132
        whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5829/artifact/patchprocess/whitespace.txt
        hadoop-mapreduce-client-hs test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5829/artifact/patchprocess/testrun_hadoop-mapreduce-client-hs.txt
        Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5829/testReport/
        Java 1.7.0_55
        uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5829/console

        This message was automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Looks good. Fixing the whitespace issue myself.

        hadoopqa Hadoop QA added a comment -



        +1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 15m 36s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        +1 javac 7m 30s There were no new javac warning messages.
        +1 javadoc 9m 42s There were no new javadoc warning messages.
        +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 27s There were no new checkstyle issues.
        +1 whitespace 0m 0s The patch has no lines that end in whitespace.
        +1 install 1m 34s mvn install still works.
        +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
        +1 findbugs 0m 50s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 mapreduce tests 5m 53s Tests passed in hadoop-mapreduce-client-hs.
            42m 34s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12741170/MAPREDUCE-6410.05.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / fac4e04
        hadoop-mapreduce-client-hs test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5832/artifact/patchprocess/testrun_hadoop-mapreduce-client-hs.txt
        Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5832/testReport/
        Java 1.7.0_55
        uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5832/console

        This message was automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Committed this to trunk, branch-2 and branch-2.7. Thanks Varun!

        varun_saxena Varun Saxena added a comment -

        Thanks for the review and commit, Vinod Kumar Vavilapalli.

        varun_saxena Varun Saxena added a comment -

        Thanks Zhijie Shen and Xuan Gong for the review as well.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #968 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/968/)
        MAPREDUCE-6410. Fixed MapReduce JobHistory server to use the right (login) UGI to refresh log and cleaner settings. Contributed by Varun Saxena. (vinodkv: rev d481684c7c9293a94f54ef622a92753531c6acc7)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/server/TestHSAdminServer.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java
        • hadoop-mapreduce-project/CHANGES.txt
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #238 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/238/)
        MAPREDUCE-6410. Fixed MapReduce JobHistory server to use the right (login) UGI to refresh log and cleaner settings. Contributed by Varun Saxena. (vinodkv: rev d481684c7c9293a94f54ef622a92753531c6acc7)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/server/TestHSAdminServer.java
        • hadoop-mapreduce-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #227 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/227/)
        MAPREDUCE-6410. Fixed MapReduce JobHistory server to use the right (login) UGI to refresh log and cleaner settings. Contributed by Varun Saxena. (vinodkv: rev d481684c7c9293a94f54ef622a92753531c6acc7)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/server/TestHSAdminServer.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java
        • hadoop-mapreduce-project/CHANGES.txt
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Hdfs-trunk #2166 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2166/)
        MAPREDUCE-6410. Fixed MapReduce JobHistory server to use the right (login) UGI to refresh log and cleaner settings. Contributed by Varun Saxena. (vinodkv: rev d481684c7c9293a94f54ef622a92753531c6acc7)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/server/TestHSAdminServer.java
        • hadoop-mapreduce-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #236 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/236/)
        MAPREDUCE-6410. Fixed MapReduce JobHistory server to use the right (login) UGI to refresh log and cleaner settings. Contributed by Varun Saxena. (vinodkv: rev d481684c7c9293a94f54ef622a92753531c6acc7)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/server/TestHSAdminServer.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java
        • hadoop-mapreduce-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2184 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2184/)
        MAPREDUCE-6410. Fixed MapReduce JobHistory server to use the right (login) UGI to refresh log and cleaner settings. Contributed by Varun Saxena. (vinodkv: rev d481684c7c9293a94f54ef622a92753531c6acc7)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/server/TestHSAdminServer.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java
        • hadoop-mapreduce-project/CHANGES.txt

          People

          • Assignee:
            varun_saxena Varun Saxena
            Reporter:
            sijing0410 Zhang Wei
          • Votes:
            0
            Watchers:
            10

            Dates

            • Due:
              Created:
              Updated:
              Resolved:
