Description
When you have a data source table and you run a "ALTER TABLE...ADD COLUMNS" query, Spark will save invalid metadata to the Hive metastore.
Namely, it will overwrite the table's schema with the data frame's schema; that is not desired for data source tables (where the schema is stored in a table property instead).
Moreover, if you use a newer metastore client where METASTORE_DISALLOW_INCOMPATIBLE_COL_TYPE_CHANGES is on by default, you actually get an exception:
InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions : c1) at org.apache.hadoop.hive.metastore.MetaStoreUtils.throwExceptionIfIncompatibleColTypeChange(MetaStoreUtils.java:615) at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:133) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3704) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3675) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) at com.sun.proxy.$Proxy26.alter_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:402) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:309) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154) at com.sun.proxy.$Proxy27.alter_table_with_environmentContext(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:601)
That exception is handled by Spark in an odd way (see code in HiveExternalCatalog.scala) which still stores invalid metadata.