SPARK-22463: Missing hadoop/hive/hbase/etc. configuration files in SPARK_CONF_DIR are not added to the distributed archive


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.2, 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: YARN
    • Labels: None

      Description

      When I ran self-contained SQL apps, such as:

      import org.apache.spark.sql.SparkSession
      
      object ShowHiveTables {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession
            .builder()
            .appName("Show Hive Tables")
            .enableHiveSupport()
            .getOrCreate()
          spark.sql("show tables").show()
          spark.stop()
        }
      }
      

      in *yarn cluster* mode, with `hive-site.xml` correctly placed in `$SPARK_HOME/conf`, they failed to connect to the right Hive metastore because `hive-site.xml` is not visible on the AM/Driver's classpath.

      Submitting them with `--files/--jars local/path/to/hive-site.xml`, or putting the file in `$HADOOP_CONF_DIR`/`$YARN_CONF_DIR`, makes these apps work in cluster mode as they do in client mode. However, according to the official doc (see http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables):
      > Configuration of Hive is done by placing your hive-site.xml, core-site.xml (for security configuration), and hdfs-site.xml (for HDFS configuration) file in conf/.

      We should either honor these configuration files in cluster mode as well, or update the Hive tables documentation to cover cluster mode.
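
      For reference, the eventual fix (Fix Version 2.3.0) has the YARN client ship the configuration files found under `SPARK_CONF_DIR` alongside the Hadoop ones in the archive it uploads for the AM/Driver. Below is only a minimal sketch of that idea, not the actual patch; the object/method names and the `*.xml` filter are assumptions for illustration.

      import java.io.File
      
      // Sketch: collect the configuration files the YARN client should ship
      // to the AM/Driver. Hypothetical helper, not Spark's real API.
      object ConfArchiveSketch {
        def confFilesToDistribute(): Seq[File] = {
          // Files under HADOOP_CONF_DIR / YARN_CONF_DIR were already shipped
          // before the fix.
          val hadoopConfFiles = for {
            envKey <- Seq("HADOOP_CONF_DIR", "YARN_CONF_DIR")
            dir    <- sys.env.get(envKey).map(new File(_)).toSeq
            if dir.isDirectory
            file   <- dir.listFiles().toSeq
            if file.isFile
          } yield file
      
          // The missing piece this issue reports: also include *.xml files from
          // SPARK_CONF_DIR (hive-site.xml, hbase-site.xml, ...) so that they are
          // visible on the Driver's classpath in yarn cluster mode.
          val sparkConfFiles = sys.env.get("SPARK_CONF_DIR")
            .map(new File(_))
            .filter(_.isDirectory)
            .toSeq
            .flatMap(_.listFiles())
            .filter(f => f.isFile && f.getName.endsWith(".xml"))
      
          hadoopConfFiles ++ sparkConfFiles
        }
      }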


            People

            • Assignee: Kent Yao
            • Reporter: Kent Yao
            • Votes: 0
            • Watchers: 4
