  1. Spark
  2. SPARK-23209

HiveDelegationTokenProvider throws an exception if Hive jars are not on the classpath

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0
    • Component/s: Spark Core
    • Labels:
      None
    • Environment:

      OSX, Java(TM) SE Runtime Environment (build 1.8.0_92-b14), Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)

    • Target Version/s:

      Description

      While doing some Hive-on-Spark testing against the Spark 2.3.0 release candidates we came across a bug (see HIVE-18436).

      Stack-trace:

      Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
              at org.apache.spark.deploy.security.HadoopDelegationTokenManager.getDelegationTokenProviders(HadoopDelegationTokenManager.scala:68)
              at org.apache.spark.deploy.security.HadoopDelegationTokenManager.<init>(HadoopDelegationTokenManager.scala:54)
              at org.apache.spark.deploy.yarn.security.YARNHadoopDelegationTokenManager.<init>(YARNHadoopDelegationTokenManager.scala:44)
              at org.apache.spark.deploy.yarn.Client.<init>(Client.scala:123)
              at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1502)
              at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
              at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
              at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
              at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
              at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
              at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
              at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
              ... 10 more
      

      The bug appears to have been introduced by SPARK-20434, which changed HiveDelegationTokenProvider to construct o.a.h.hive.conf.HiveConf directly inside HiveCredentialProvider#hiveConf rather than loading the class manually via the class loader. With the new code, the JVM tries to load HiveConf as soon as HiveDelegationTokenProvider is referenced. Since there is no try-catch around the construction of HiveDelegationTokenProvider, the resulting NoClassDefFoundError (caused by a ClassNotFoundException, as in the stack trace above) propagates and crashes spark-submit. Spark's docs/running-on-yarn.md says "a Hive token will be obtained if Hive is on the classpath", which implies that a missing Hive should be tolerated; the current behavior contradicts that.
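
      To illustrate the failure mode, here is a minimal sketch of guarding provider construction so that a missing dependency skips the provider instead of aborting startup. It is not Spark's actual code: TokenProvider, ProviderLoading, safeCreate and the two factory methods are hypothetical stand-ins, and the missing Hive class is simulated by throwing NoClassDefFoundError directly. Note that NoClassDefFoundError is a LinkageError, so neither `case e: Exception` nor scala.util.control.NonFatal would match it.

```scala
// Hypothetical stand-ins for illustration; these are not Spark's real classes.
trait TokenProvider { def serviceName: String }

object ProviderLoading {
  // Simulates a provider whose construction needs a class that is absent
  // from the classpath (as HiveConf was in the report above).
  def missingHiveProvider(): TokenProvider =
    throw new NoClassDefFoundError("org/apache/hadoop/hive/conf/HiveConf")

  // A provider with no optional dependencies; construction always succeeds.
  def hadoopFsProvider(): TokenProvider =
    new TokenProvider { val serviceName = "hadoopfs" }

  // Evaluates the by-name factory inside a try block and returns None when a
  // class it depends on cannot be linked. NoClassDefFoundError is an Error,
  // not an Exception, so it has to be matched explicitly.
  def safeCreate(create: => TokenProvider): Option[TokenProvider] =
    try Some(create)
    catch {
      case e: LinkageError =>
        Console.err.println(s"Skipping provider, missing dependency: ${e.getMessage}")
        None
    }

  // Only providers that could be constructed end up in the result.
  def loadProviders(): Seq[TokenProvider] =
    Seq(safeCreate(hadoopFsProvider()), safeCreate(missingHiveProvider())).flatten
}
```

      With this guard, ProviderLoading.loadProviders() yields only the "hadoopfs" provider, which matches the documented behavior of obtaining a Hive token only when Hive is on the classpath.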

              People

              • Assignee:
                vanzin Marcelo Vanzin
                Reporter:
                stakiar Sahil Takiar