  1. Spark
  2. SPARK-18627

Cannot connect to Hive metastore in client mode with proxy user

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: YARN
    • Labels:
      None

      Description

      Marking as "minor" since the security story for client mode with proxy users is a little sketchy to start with, but it shouldn't fail, at least not in this manner. The error you get is:

      Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed
              at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
              at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
              at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
              at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
              at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
              at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:430)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:240)
              at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
              at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1528)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
              at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3238)
              at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3257)
              at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3482)
              at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:225)
              at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:209)
              at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:332)
              at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:293)
              at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:268)
              at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:529)
              at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
              at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:249)
      

      Cluster mode works fine.

          Activity

          mkazia Mubashir Kazia added a comment -

          Spark connects to and uses HMS not HS2. It fetches a delegation token for HMS (https://github.com/apache/spark/blob/branch-2.0/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L376) but stores it in the credential cache with the label for HS2. The client (driver/executor) is not aware that a delegation token exists for HMS, so it tries to authenticate with Kerberos/GSSAPI, cannot find the TGT or service tickets in the Kerberos ticket cache, and fails with the error above.

          The problematic code is here https://github.com/apache/spark/blob/branch-2.0/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala#L147
          It should probably be
          credentials.addToken(new Text(org.apache.hadoop.hive.thrift.DelegationTokenIdentifier.HIVE_DELEGATION_KIND), _)
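
To make the hypothesis above concrete, here is a minimal sketch in plain Java. It is a simplified model, not Hadoop's Credentials class or Hive's actual token lookup: the class TokenAliasSketch, the method chooseAuth, and both label strings are hypothetical names invented for this illustration. It also assumes that the client finds its token by label, which is the commenter's theory and is not confirmed in this issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy stand-in for a credential cache keyed by label (alias).
// Hypothetical model only; not Hadoop's Credentials API.
final class TokenAliasSketch {
    // Placeholder labels for illustration; the real values live in the linked sources.
    static final String HS2_LABEL = "hs2-label";
    static final String HMS_LABEL = "hms-label";

    private final Map<String, String> tokensByAlias = new HashMap<>();

    void addToken(String alias, String token) {
        tokensByAlias.put(alias, token);
    }

    Optional<String> getToken(String alias) {
        return Optional.ofNullable(tokensByAlias.get(alias));
    }

    // Assumed client behavior: use the delegation token (DIGEST) if one is
    // found under the expected label, otherwise fall back to Kerberos (GSSAPI).
    static String chooseAuth(TokenAliasSketch creds, String expectedAlias) {
        return creds.getToken(expectedAlias).isPresent() ? "DIGEST" : "GSSAPI";
    }

    public static void main(String[] args) {
        // Token fetched for HMS but stored under the HS2 label: lookup misses.
        TokenAliasSketch mislabeled = new TokenAliasSketch();
        mislabeled.addToken(HS2_LABEL, "hms-delegation-token");
        System.out.println(chooseAuth(mislabeled, HMS_LABEL)); // prints "GSSAPI"

        // Same token stored under the label the client looks for: lookup hits.
        TokenAliasSketch relabeled = new TokenAliasSketch();
        relabeled.addToken(HMS_LABEL, "hms-delegation-token");
        System.out.println(chooseAuth(relabeled, HMS_LABEL)); // prints "DIGEST"
    }
}
```

In this model the GSSAPI fallback only succeeds if a usable Kerberos ticket is present, which would be consistent with the "GSS initiate failed" error in the description when no ticket is available to the proxy user.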

          The title of this jira is also incorrect. It is successfully able to fetch HMS delegation token in the spark-submit code; it is the connection to the HMS from the driver that fails.

          vanzin Marcelo Vanzin added a comment -

          "The title of this jira is also incorrect. It is successfully able to fetch HMS delegation token"

          Right, changed.

          "Spark connects to and uses HMS not HS2. It fetches a delegation token for HMS...but stores it in the credential cache with the label for HS2"

          Although that does look fishy, it doesn't explain why that works in cluster mode. In fact, client mode shouldn't even need delegation tokens for the HMS, although with a proxy user maybe things get a little messed up.

          mkazia Mubashir Kazia added a comment -

          "it doesn't explain why that works in cluster mode."

          Just a wild guess: it probably also falls back to GSSAPI auth, but succeeds because the AMDelegationTokenRenewer leaves a YARN TGT available in the UGI/subject/ticket cache. I can't confirm this is the case because the HMS audit logging does not log the real user.

          "In fact, client mode shouldn't even need delegation tokens for the HMS, although with a proxy user maybe things get a little messed up."

          Agree.


            People

            • Assignee:
              Unassigned
              Reporter:
              vanzin Marcelo Vanzin
            • Votes:
              0
              Watchers:
              4
