Description
For example, when we run
val originalDefaultSource = conf.defaultDataSourceName val rdd = sparkContext.parallelize((1 to 10).map(i => s"""{"a":$i, "b":"str${i}"}""")) val df = jsonRDD(rdd) conf.setConf(SQLConf.DEFAULT_DATA_SOURCE_NAME, "org.apache.spark.sql.json") // Save the df as a managed table (by not specifiying the path). df.saveAsTable("savedJsonTable") checkAnswer( sql("SELECT * FROM savedJsonTable tmp where tmp.a > 5"), df.collect()) // Drop table will also delete the data. sql("DROP TABLE savedJsonTable") conf.setConf(SQLConf.DEFAULT_DATA_SOURCE_NAME, originalDefaultSource)
We will get
query with predicates *** FAILED *** (85 milliseconds)
[info] org.apache.spark.sql.AnalysisException: cannot resolve 'tmp.a' given input columns a, b
[info] at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.failAnalysis(Analyzer.scala:78)
[info] at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$18$$anonfun$apply$2.applyOrElse(Analyzer.scala:88)
[info] at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$18$$anonfun$apply$2.applyOrElse(Analyzer.scala:85)
Attachments
Issue Links
- links to