Sqoop (Retired) / SQOOP-2709

Sqoop2: HDFS: Impersonation on secured cluster doesn't work

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.99.7
    • Component/s: None
    • Labels: None

    Description

      Using the HDFS connector on a secured cluster currently fails with the following exception:

      2015-11-19 13:24:30,624 [OutputFormatLoader-consumer] ERROR org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Error while loading data out of MR job.
      org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0005:Error occurs during loader run
      	at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:119)
      	at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:60)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
      	at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:60)
      	at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:44)
      	at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:267)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "sqoopkrb-4.vpc.cloudera.com/172.28.211.196"; destination host is: "sqoopkrb-1.vpc.cloudera.com":8020; 
      	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1476)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1403)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
      	at com.sun.proxy.$Proxy15.create(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
      	at com.sun.proxy.$Proxy16.create(Unknown Source)
      	at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1867)
      	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1737)
      	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1662)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:404)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:400)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:343)
      	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
      	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:898)
      	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
      	at org.apache.sqoop.connector.hdfs.hdfsWriter.HdfsTextWriter.initialize(HdfsTextWriter.java:40)
      	at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:93)
      	... 12 more
      Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
      	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
      	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1442)
      	... 36 more
      Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
      	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
      	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
      	at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
      	... 39 more
      Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
      	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
      	at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
      	at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
      	at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
      	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
      	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
      	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
      	... 48 more
      

      It's a very long exception, but the gist of it is here:

      Host Details : local host is: "sqoopkrb-4.vpc.cloudera.com/172.28.211.196"; destination host is: "sqoopkrb-1.vpc.cloudera.com":8020;
      

      Together with abrahamfine, we've triaged this to the fact that we perform impersonation in exactly the same way on the Sqoop 2 server side and on the mapper side. However, on the mapper side we no longer have a Kerberos ticket; we only have a delegation token for the sqoop2 user. The Hadoop documentation contains this very relevant snippet:

      If the cluster is running in Secure Mode, the superuser must have kerberos credentials to be able to impersonate another user. It cannot use delegation tokens for this feature.

      Hence, in order to do impersonation properly on a secured cluster, we will have to do some dark magic with delegation tokens: retrieve a delegation token (DT) for the end user inside the HDFS initialization and pass it to the execution engine.
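
      The credential flow described above can be sketched as a small plain-Java model. This is only an illustration and not the code from the attached patch: it uses no Hadoop classes, and every name in it (ImpersonationModel, initializeOnServer, writeOnMapper, the user "alice") is hypothetical. In real Hadoop code the server-side step would presumably involve UserGroupInformation.createProxyUser and FileSystem.addDelegationTokens, but that mapping is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: no Hadoop classes are used, and all names here are hypothetical.
class ImpersonationModel {

    // The two kinds of credential mentioned in the report.
    enum Credential { KERBEROS_TICKET, DELEGATION_TOKEN }

    // Secure-mode rule from the quoted Hadoop documentation:
    // a superuser may impersonate only when it holds Kerberos credentials.
    static boolean canImpersonate(Credential held) {
        return held == Credential.KERBEROS_TICKET;
    }

    // Server side (HDFS initialization): the server still has a Kerberos ticket,
    // so it can act as the end user and fetch a delegation token owned by that user.
    static Map<String, String> initializeOnServer(Credential held, String endUser) {
        if (!canImpersonate(held)) {
            throw new IllegalStateException(
                "cannot impersonate " + endUser + " without a Kerberos ticket");
        }
        Map<String, String> jobCredentials = new HashMap<>();
        jobCredentials.put(endUser, "delegation-token-for-" + endUser);
        return jobCredentials; // handed to the execution engine together with the job
    }

    // Mapper side: no impersonation is attempted. The task authenticates
    // directly with the end user's token that was shipped with the job.
    static String writeOnMapper(Map<String, String> jobCredentials, String endUser) {
        String token = jobCredentials.get(endUser);
        if (token == null) {
            throw new IllegalStateException("no delegation token for " + endUser);
        }
        return "file written as " + endUser;
    }

    public static void main(String[] args) {
        Map<String, String> creds = initializeOnServer(Credential.KERBEROS_TICKET, "alice");
        System.out.println(writeOnMapper(creds, "alice"));
    }
}
```

      In this model, calling initializeOnServer with Credential.DELEGATION_TOKEN throws, which mirrors the failure seen on the mapper side when it tries to impersonate with only the sqoop2 delegation token.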

      Attachments

        1. SQOOP-2709.patch
          18 kB
          Jarek Jarcec Cecho
        2. SQOOP-2709.patch
          18 kB
          Jarek Jarcec Cecho

            People

              Assignee: Jarek Jarcec Cecho (jarcec)
              Reporter: Jarek Jarcec Cecho (jarcec)