Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-21617

ALTER TABLE...ADD COLUMNS broken in Hive 2.1 for DS tables

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.1, 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      When you have a data source table and you run a "ALTER TABLE...ADD COLUMNS" query, Spark will save invalid metadata to the Hive metastore.

      Namely, it will overwrite the table's schema with the data frame's schema; that is not desired for data source tables (where the schema is stored in a table property instead).

      Moreover, if you use a newer metastore client where METASTORE_DISALLOW_INCOMPATIBLE_COL_TYPE_CHANGES is on by default, you actually get an exception:

      InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions :
      c1)
      	at org.apache.hadoop.hive.metastore.MetaStoreUtils.throwExceptionIfIncompatibleColTypeChange(MetaStoreUtils.java:615)
      	at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:133)
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3704)
      	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3675)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
      	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
      	at com.sun.proxy.$Proxy26.alter_table_with_environment_context(Unknown Source)
      	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:402)
      	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:309)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
      	at com.sun.proxy.$Proxy27.alter_table_with_environmentContext(Unknown Source)
      	at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:601)
      

      That exception is handled by Spark in an odd way (see code in HiveExternalCatalog.scala) which still stores invalid metadata.

        Attachments

          Activity

            People

            • Assignee:
              vanzin Marcelo Vanzin
              Reporter:
              vanzin Marcelo Vanzin
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: