  1. Hadoop Map/Reduce
  2. MAPREDUCE-6410

Aggregated Logs Deletion doesn't work after refreshing Log Retention Settings in secure cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.1, 3.0.0-alpha1
    • Component/s: None
    • Labels:
    • Environment:

      mrV2, secure mode

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      A GSSException is thrown every time aggregated log deletion is attempted after running bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster.

      The problem can be reproduced with the following steps:
      1. Start the historyserver in a secure cluster.
      2. Log deletion happens as expected.
      3. Run the mapred hsadmin -refreshLogRetentionSettings command to refresh the configuration value.
      4. All subsequent log deletion attempts fail with a GSSException.
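
The description does not state a root cause. One hypothesis consistent with these steps is that the deletion timer created at startup and the one re-created by the refresh are built on threads with different security contexts, and that the timer thread keeps whatever context was in effect when it was created. The sketch below only models that mechanism with plain JDK classes. The class name, the `CONTEXT` variable, and the two context labels are hypothetical stand-ins, not Hadoop code, and `InheritableThreadLocal` is used here as an analogy for a thread-inherited security context.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class TimerContextSketch {
    // Hypothetical stand-in for a per-thread security context.
    // Child threads copy the parent's value at thread-creation time.
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    /**
     * Creates a Timer on a thread whose context is creatorContext and
     * returns the context that the timer's task observes when it runs.
     */
    public static String contextSeenByTimerTask(String creatorContext) {
        AtomicReference<String> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        Thread creator = new Thread(() -> {
            CONTEXT.set(creatorContext);
            // The timer's worker thread is created here, so it inherits
            // the creator's context, not the context of later callers.
            Timer timer = new Timer(true);
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    seen.set(CONTEXT.get());
                    done.countDown();
                    timer.cancel();
                }
            }, 0);
        });

        try {
            creator.start();
            creator.join();
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timer task did not run");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        // Timer created while the service's own login context is active.
        System.out.println(contextSeenByTimerTask("service-login-user"));
        // Timer re-created while handling an admin request.
        System.out.println(contextSeenByTimerTask("admin-rpc-caller"));
    }
}
```

Under this model, a timer re-created while an admin request is being handled would run its tasks with the request's context rather than the service's. If that context carries no Kerberos credentials, every later deletion attempt would fail in the way step 4 describes.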

      The following exception appears in the historyserver's log if log deletion is enabled.

      2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127
      java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination host is: "vm-33":25000; 
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
              at org.apache.hadoop.ipc.Client.call(Client.java:1414)
              at org.apache.hadoop.ipc.Client.call(Client.java:1363)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
              at com.sun.proxy.$Proxy9.getListing(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
              at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
              at com.sun.proxy.$Proxy10.getListing(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
              at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
              at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
              at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
              at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
              at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
              at java.util.TimerThread.mainLoop(Timer.java:555)
              at java.util.TimerThread.run(Timer.java:505)
      Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
              at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
              at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
              at org.apache.hadoop.ipc.Client.call(Client.java:1381)
              ... 21 more
      Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
              at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411)
              at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:550)
              at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:716)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:712)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:711)
              ... 24 more
      Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
              at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
              at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
              at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
              at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
              ... 33 more
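
When triaging a trace like the one above, it can help to reduce it to its chain of "Caused by:" headers and look at the innermost one. The following is a minimal sketch of such a helper. The class and method names are hypothetical, and the sample input in `main` is an abbreviated excerpt of the trace above, not a complete log.

```java
import java.util.ArrayList;
import java.util.List;

public class CauseChainSketch {
    private static final String CAUSED_BY = "Caused by: ";

    /**
     * Returns the exception headers of a Java stack trace, outermost first:
     * the first non-frame line, then the text of each "Caused by:" line.
     */
    public static List<String> causeChain(String trace) {
        List<String> chain = new ArrayList<>();
        for (String raw : trace.split("\\R")) {
            String line = raw.trim();
            // Skip blank lines, stack frames, and "... N more" markers.
            if (line.isEmpty() || line.startsWith("at ") || line.startsWith("...")) {
                continue;
            }
            if (line.startsWith(CAUSED_BY)) {
                chain.add(line.substring(CAUSED_BY.length()));
            } else if (chain.isEmpty()) {
                chain.add(line);
            }
        }
        return chain;
    }

    /** Returns the innermost exception header, or an empty string if none is found. */
    public static String rootCause(String trace) {
        List<String> chain = causeChain(trace);
        return chain.isEmpty() ? "" : chain.get(chain.size() - 1);
    }

    public static void main(String[] args) {
        // Abbreviated excerpt of the trace in this report.
        String trace = String.join("\n",
            "java.io.IOException: Failed on local exception",
            "        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)",
            "Caused by: javax.security.sasl.SaslException: GSS initiate failed",
            "        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411)",
            "        ... 24 more",
            "Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)",
            "        at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)",
            "        ... 33 more");
        System.out.println(rootCause(trace));
    }
}
```

For the excerpt in `main`, the innermost header is the GSSException line, which matches the last "Caused by:" entry in the full trace.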
      

        Attachments

        1. YARN-3779.03.patch
          3 kB
          Varun Saxena
        2. YARN-3779.02.patch
          5 kB
          Varun Saxena
        3. YARN-3779.01.patch
          5 kB
          Varun Saxena
        4. MAPREDUCE-6410.05.patch
          7 kB
          Vinod Kumar Vavilapalli
        5. MAPREDUCE-6410.04.patch
          7 kB
          Varun Saxena
        6. log_aggr_deletion_on_refresh_fix.log
          294 kB
          Varun Saxena
        7. log_aggr_deletion_on_refresh_error.log
          375 kB
          Varun Saxena

            People

            • Assignee:
              Varun Saxena (varun_saxena)
            • Reporter:
              Zhang Wei (sijing0410)
            • Votes:
              0
            • Watchers:
              10
