Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 1.4.1
None
None
Environment: Ubuntu on AWS
Description
ALTER TABLE tbl CHANGE cannot find a column that DESCRIBE COLUMN lists.
In the case of a table generated with HiveContext.read.json(), the output of DESCRIBE dimension_components is:
comp_config struct<adText:string,adTextLeft:string,background:string,brand:string,button_color:string,cta_side:string,cta_type:string,depth:string,fixed_under:string,light:string,mid_text:string,oneline:string,overhang:string,shine:string,style:string,style_secondary:string,style_small:string,type:string>
comp_criteria string
comp_data_model string
comp_dimensions struct<data:string,integrations:array<string>,template:string,variation:bigint>
comp_disabled boolean
comp_id bigint
comp_path string
comp_placementData struct<mod:string>
comp_slot_types array<string>
However, alter table dimension_components change comp_dimensions comp_dimensions struct<data:string,integrations:array<string>,template:string,variation:bigint,z:string>; fails with:
15/08/08 23:13:07 ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: Invalid column reference comp_dimensions
    at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3584)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:312)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:345)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:326)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:155)
    at org.apache.spark.sql.hive.client.ClientWrapper.runHive(ClientWrapper.scala:326)
    at org.apache.spark.sql.hive.client.ClientWrapper.runSqlHive(ClientWrapper.scala:316)
    at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:473)
    ...
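Meanwhile, SHOW COLUMNS in dimension_components lists only two columns: col, which DESCRIBE does not report for this table, and z, which was just added. None of the comp_* columns that DESCRIBE reports appear.
This suggests that DDL operations in Spark SQL do not all read the same table metadata: DESCRIBE sees one column list, while ALTER TABLE ... CHANGE and SHOW COLUMNS see another.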
Full spark-sql output here.
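The inconsistency suspected above can be illustrated with a toy model. This is only a sketch of the hypothesis, not Spark or Hive code: it assumes the table's schema is recorded in two places, one consulted by DESCRIBE and another consulted by ALTER TABLE ... CHANGE and SHOW COLUMNS. All names (ToyTable, describe, show_columns, change_column, add_column) are hypothetical, the type given to the placeholder column col is an assumption, and the report does not show how z was added, so an ADD COLUMNS-style operation is assumed.

```python
# Toy model only: every name here is hypothetical, not Spark or Hive API.
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    # Column list as one set of DDL commands sees it.
    metastore_columns: dict = field(default_factory=dict)
    # Schema recorded separately (assumption: e.g. alongside the table).
    property_schema: dict = field(default_factory=dict)


def describe(table):
    """Read path that prefers the separately recorded schema."""
    return dict(table.property_schema or table.metastore_columns)


def show_columns(table):
    """Read path that consults only the metastore column list."""
    return list(table.metastore_columns)


def change_column(table, old_name, new_name, new_type):
    """Write path that consults only the metastore column list."""
    if old_name not in table.metastore_columns:
        raise LookupError(f"Invalid column reference {old_name}")
    # Rebuild the mapping so the column keeps its position.
    table.metastore_columns = {
        (new_name if name == old_name else name):
            (new_type if name == old_name else typ)
        for name, typ in table.metastore_columns.items()
    }


def add_column(table, name, typ):
    """Assumed way that z was added; also touches only the column list."""
    table.metastore_columns[name] = typ


if __name__ == "__main__":
    table = ToyTable(
        metastore_columns={"col": "array<string>"},  # placeholder type is assumed
        property_schema={
            "comp_dimensions": "struct<data:string,template:string>",
            "comp_id": "bigint",
        },
    )
    print(sorted(describe(table)))   # the comp_* columns
    add_column(table, "z", "string")
    print(show_columns(table))       # the placeholder column plus z
    try:
        change_column(table, "comp_dimensions", "comp_dimensions",
                      "struct<data:string,template:string,z:string>")
    except LookupError as err:
        print(err)
```

In this model each command behaves consistently against the metadata it reads, which is one way the symptoms in the report could arise without any single command being wrong.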
Issue Links
- is related to SPARK-9764 Spark SQL uses table metadata inconsistently (Closed)