Description
Currently, evolving an Avro schema from Impala requires two steps:
ALTER TABLE avro_table SET TBLPROPERTIES('avro.schema.url'='hdfs:///tmp/evolved.avsc');
ALTER TABLE avro_table ADD COLUMNS(c3 string);
Impala can end up in a bad state in several ways if this is not done exactly as above, e.g. by switching the order of the statements and then computing stats, or by running:
ALTER TABLE avro_table SET TBLPROPERTIES('avro.schema.url'='hdfs:///tmp/evolved.avsc');
REFRESH avro_table;
COMPUTE STATS avro_table;
This results in the error:
ERROR: AnalysisException: Cannot COMPUTE STATS on Avro table 'avro_table' because its column definitions do not match those in the Avro schema.
Missing column definition corresponding to Avro-schema column 'thefield3' of type 'STRING' at position '1'.
Please re-create the table with column definitions, e.g., using the result of 'SHOW CREATE TABLE'
Instead of this message, the error should describe how to perform schema evolution properly (even better: state which step is missing).
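
To illustrate the "which step is missing" idea, here is a minimal sketch in Python, not Impala's actual analyzer code. It compares a table's column names against the fields of an Avro schema and builds the ALTER TABLE ... ADD COLUMNS statement the message could suggest. The function name `missing_column_hint`, the `AVRO_TO_IMPALA` mapping, and the field name `thefield1` are hypothetical; only plain primitive Avro types are handled.

```python
import json

# Hypothetical, simplified mapping from Avro primitive types to Impala column types.
AVRO_TO_IMPALA = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def missing_column_hint(table_name, table_columns, avro_schema_json):
    """Return an ALTER TABLE suggestion for Avro fields absent from the table, or None."""
    fields = json.loads(avro_schema_json)["fields"]
    known = {name.lower() for name in table_columns}
    missing = []
    for field in fields:
        if field["name"].lower() in known:
            continue
        avro_type = field["type"]
        # Unions and complex types are skipped in this sketch.
        if isinstance(avro_type, str) and avro_type in AVRO_TO_IMPALA:
            missing.append(f"{field['name']} {AVRO_TO_IMPALA[avro_type]}")
    if not missing:
        return None
    return f"ALTER TABLE {table_name} ADD COLUMNS({', '.join(missing)});"


# Assumed schema shape: 'thefield3' sits at position 1, as in the error above.
schema = json.dumps({
    "type": "record",
    "name": "example",
    "fields": [
        {"name": "thefield1", "type": "string"},
        {"name": "thefield3", "type": "string"},
    ],
})
hint = missing_column_hint("avro_table", ["thefield1"], schema)
print(hint)
```

With a hint like this, the error could end with the exact statement to run rather than asking the user to re-create the table.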