  1. Spark
  2. SPARK-8020

Spark SQL conf in spark-defaults.conf makes metadataHive get constructed too early

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: SQL
    • Labels: None

    Description

      To construct the metadataHive object correctly, we need two settings: spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars. Users who want to use Hive 0.12's metastore need to set spark.sql.hive.metastore.version to 0.12.0 and spark.sql.hive.metastore.jars to maven or to a classpath containing the Hive and Hadoop jars. However, any Spark SQL setting in spark-defaults.conf triggers the construction of metadataHive too early and causes Spark SQL to connect to the wrong metastore (e.g. the local Derby metastore instead of a remote MySQL Hive 0.12 metastore). Also, if spark.sql.hive.metastore.version 0.12.0 is the first conf set in the SQL conf, we get:

      Exception in thread "main" java.lang.IllegalArgumentException: Builtin jars can only be used when hive execution version == hive metastore version. Execution: 0.13.1 != Metastore: 0.12.0. Specify a vaild path to the correct hive jars using $HIVE_METASTORE_JARS or change spark.sql.hive.metastore.version to 0.13.1.
      	at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:186)
      	at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
      	at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:358)
      	at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:186)
      	at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:185)
      	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
      	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
      	at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:185)
      	at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:71)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:248)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:136)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
      	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
      	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
      	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
      	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
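
The ordering problem described above can be illustrated with a toy model. This is only a sketch in Python, not Spark's code: ToyHiveContext, its metadata_hive property, and the two sample lists are hypothetical names, and the "builtin" default for spark.sql.hive.metastore.jars is an assumption inferred from the error message. It mirrors what the trace suggests: the constructor replays each spark.sql.* entry through setConf, and setConf touches the lazily built metadataHive.

```python
# Toy model of the ordering problem; none of these names are Spark's real API.
EXECUTION_VERSION = "0.13.1"  # built-in Hive version named in the error message


class ToyHiveContext:
    def __init__(self, defaults):
        self.conf = {}
        self._metadata_hive = None
        # As in the trace: replay every spark.sql.* entry through set_conf,
        # one at a time, in the order given.
        for key, value in defaults:
            if key.startswith("spark.sql."):
                self.set_conf(key, value)

    @property
    def metadata_hive(self):
        # Lazy construction: built once, from whatever conf exists right now.
        if self._metadata_hive is None:
            version = self.conf.get(
                "spark.sql.hive.metastore.version", EXECUTION_VERSION)
            # Assumed default: use the built-in Hive jars.
            jars = self.conf.get("spark.sql.hive.metastore.jars", "builtin")
            if jars == "builtin" and version != EXECUTION_VERSION:
                raise ValueError(
                    "builtin jars need execution == metastore version: "
                    f"{EXECUTION_VERSION} != {version}")
            self._metadata_hive = (version, jars)
        return self._metadata_hive

    def set_conf(self, key, value):
        self.conf[key] = value
        # Touching the lazy client here is what forces early construction.
        self.metadata_hive


# Case 1: an unrelated SQL setting comes first, so the client is built
# with defaults and the later metastore settings are ignored.
OTHER_FIRST = [
    ("spark.sql.shuffle.partitions", "10"),
    ("spark.sql.hive.metastore.version", "0.12.0"),
    ("spark.sql.hive.metastore.jars", "maven"),
]

# Case 2: the version comes first, so the client is built with version
# 0.12.0 but still the built-in jars, and the version check fails.
VERSION_FIRST = [
    ("spark.sql.hive.metastore.version", "0.12.0"),
    ("spark.sql.hive.metastore.jars", "maven"),
]
```

In this model, ToyHiveContext(OTHER_FIRST) silently ends up with a client for the built-in version, which corresponds to connecting to the wrong metastore, while ToyHiveContext(VERSION_FIRST) raises during construction, which corresponds to the IllegalArgumentException above.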
      
      

            People

              Assignee: Yin Huai (yhuai)
              Reporter: Yin Huai (yhuai)
              Votes: 0
              Watchers: 6
