Spark / SPARK-5839

HiveMetastoreCatalog does not recognize table names and aliases of data source tables.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: SQL
    • Labels: None

    Description

      For example, when we run

      val originalDefaultSource = conf.defaultDataSourceName
      
      val rdd = sparkContext.parallelize((1 to 10).map(i => s"""{"a":$i, "b":"str${i}"}"""))
      val df = jsonRDD(rdd)
      
      conf.setConf(SQLConf.DEFAULT_DATA_SOURCE_NAME, "org.apache.spark.sql.json")
      // Save the df as a managed table (by not specifying the path).
      df.saveAsTable("savedJsonTable")
      
      checkAnswer(
        sql("SELECT * FROM savedJsonTable tmp where tmp.a > 5"),
        df.collect())
      
      // Drop table will also delete the data.
      sql("DROP TABLE savedJsonTable")
      
      conf.setConf(SQLConf.DEFAULT_DATA_SOURCE_NAME, originalDefaultSource)
      

      We will get

      query with predicates *** FAILED *** (85 milliseconds)
      [info]   org.apache.spark.sql.AnalysisException: cannot resolve 'tmp.a' given input columns a, b
      [info]   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.failAnalysis(Analyzer.scala:78)
      [info]   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$18$$anonfun$apply$2.applyOrElse(Analyzer.scala:88)
      [info]   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$18$$anonfun$apply$2.applyOrElse(Analyzer.scala:85)
      
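The symptom is that the bare column `a` resolves but the alias-qualified `tmp.a` does not, which points at the table lookup dropping the alias. Below is a minimal self-contained sketch of that failure mode, using toy names (`Attribute`, `Relation`, `Aliased`, `resolve`) rather than Spark's actual classes: when the catalog returns a relation without a wrapping node that carries the alias, attributes qualified by that alias cannot be matched against the relation's output.

```scala
// Toy model of qualified-attribute resolution -- hypothetical names,
// not Spark's actual classes.
case class Attribute(name: String, qualifier: Option[String])

sealed trait Plan { def output: Seq[Attribute] }

case class Relation(cols: Seq[String]) extends Plan {
  // A bare relation's attributes carry no qualifier.
  val output: Seq[Attribute] = cols.map(c => Attribute(c, None))
}

// Wrapping node that re-qualifies every output attribute with the alias.
case class Aliased(alias: String, child: Plan) extends Plan {
  val output: Seq[Attribute] =
    child.output.map(_.copy(qualifier = Some(alias)))
}

// Resolve either "col" or "qualifier.col" against a plan's output.
def resolve(plan: Plan, ref: String): Option[Attribute] =
  ref.split('.') match {
    case Array(q, n) => plan.output.find(a => a.qualifier.contains(q) && a.name == n)
    case Array(n)    => plan.output.find(_.name == n)
  }

val table = Relation(Seq("a", "b"))

// Alias preserved: "tmp.a" resolves.
assert(resolve(Aliased("tmp", table), "tmp.a").isDefined)

// Alias dropped (the buggy path for data source tables):
// "a" still resolves, but "tmp.a" does not -- matching the
// reported "cannot resolve 'tmp.a' given input columns a, b".
assert(resolve(table, "a").isDefined)
assert(resolve(table, "tmp.a").isEmpty)

println("ok")
```

Under this model, the fix amounts to making the catalog wrap the returned relation so the alias from the FROM clause is preserved, matching what already happens for regular Hive tables.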


          People

            Assignee: Yin Huai (yhuai)
            Reporter: Yin Huai (yhuai)
            Votes: 0
            Watchers: 2
