Spark / SPARK-29604

SessionState is initialized with the isolated classloader for Hive when spark.sql.hive.metastore.jars is set

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.4, 3.0.0
    • Fix Version/s: 2.4.5, 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

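      I've observed that external listeners cannot be loaded when spark-sql is run with the "spark.sql.hive.metastore.jars" configuration set. SessionState ends up being instantiated from inside HiveClientImpl (via SQLConf.get), and registering the StreamingQueryListener then fails with a ClassNotFoundException: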

      Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
      	at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102)
      	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154)
      	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153)
      	at scala.Option.getOrElse(Option.scala:121)
      	at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153)
      	at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150)
      	at org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104)
      	at org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104)
      	at scala.Option.map(Option.scala:146)
      	at org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104)
      	at org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103)
      	at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282)
      	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296)
      	at org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215)
      	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215)
      	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
      	at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214)
      	at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)
      	at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:315)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
      	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847)
      	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
      	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
      	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
      	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:922)
      	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:931)
      	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      Caused by: org.apache.spark.SparkException: Exception when registering StreamingQueryListener
      	at org.apache.spark.sql.streaming.StreamingQueryManager.<init>(StreamingQueryManager.scala:70)
      	at org.apache.spark.sql.internal.BaseSessionStateBuilder.streamingQueryManager(BaseSessionStateBuilder.scala:260)
      	at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:296)
      	at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1099)
      	... 40 more
      Caused by: java.lang.ClassNotFoundException: com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	at java.lang.Class.forName0(Native Method)
      	at java.lang.Class.forName(Class.java:348)
      	at org.apache.spark.util.Utils$.classForName(Utils.scala:193)
      	at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2640)
      	at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2638)
      	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
      	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
      	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
      	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
      	at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2638)
      	at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.apply(StreamingQueryManager.scala:62)
      	at org.apache.spark.sql.streaming.StreamingQueryManager$$anonfun$1.apply(StreamingQueryManager.scala:61)
      	at scala.Option.foreach(Option.scala:257)
      	at org.apache.spark.sql.streaming.StreamingQueryManager.<init>(StreamingQueryManager.scala:61)
      	... 43 more
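
The failure mode in the trace above can be reproduced in miniature with plain Java. This is only a sketch under assumptions: `IsolatedLoaderDemo`, `ExternalListener`, and `loadableWith` are hypothetical names, and a `URLClassLoader` with no URLs and a null parent is a simplified stand-in for the isolated Hive client classloader, not Spark's actual implementation. It shows that a class visible to the application classloader cannot be resolved by name while the thread's context classloader is an isolated one, which appears to be what happens when the listener class is looked up during SessionState initialization.

```java
import java.net.URL;
import java.net.URLClassLoader;

class IsolatedLoaderDemo {
    // Hypothetical stand-in for an externally supplied listener class.
    static class ExternalListener {}

    // Temporarily installs `loader` as the thread context classloader, tries to
    // resolve `className` through the context classloader, then restores the
    // previous loader. Returns true if the class could be found.
    static boolean loadableWith(String className, ClassLoader loader) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            Class.forName(className, false, thread.getContextClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } finally {
            thread.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        String listener = ExternalListener.class.getName();

        // Loader that defined this class: it can see ExternalListener.
        ClassLoader appLoader = IsolatedLoaderDemo.class.getClassLoader();

        // Assumption: no URLs and a null parent, so only bootstrap classes
        // (java.lang.String and friends) are visible through this loader.
        ClassLoader isolated = new URLClassLoader(new URL[0], null);

        System.out.println("via application loader: " + loadableWith(listener, appLoader));
        System.out.println("via isolated loader:    " + loadableWith(listener, isolated));
    }
}
```

The `finally` block restores the original context classloader, so any lookup that needs application classes only works once control has left the isolated section; code that runs while the isolated loader is still installed sees only what that loader exposes.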
       
      

          People

            Assignee: kabhwan Jungtaek Lim
            Reporter: kabhwan Jungtaek Lim