IMPALA-4247

Improve Avro schema evolution related logging

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: None
    • Component/s: Frontend

    Description

      Currently, Avro schema evolution done from Impala needs two steps:

      ALTER TABLE avro_table SET TBLPROPERTIES('avro.schema.url'='hdfs:///tmp/evolved.avsc');
      ALTER TABLE avro_table ADD COLUMNS(c3 string);
      

      Impala can end up in a bad state in several ways if these steps are not followed exactly, e.g. by switching the order of the statements and then computing stats, or by running:

      ALTER TABLE avro_table SET TBLPROPERTIES('avro.schema.url'='hdfs:///tmp/evolved.avsc');
      REFRESH avro_table;
      COMPUTE STATS avro_table;
      

      which results in the error:
      ERROR: AnalysisException: Cannot COMPUTE STATS on Avro table 'avro_table' because its column definitions do not match those in the Avro schema.
      Missing column definition corresponding to Avro-schema column 'thefield3' of type 'STRING' at position '1'.
      Please re-create the table with column definitions, e.g., using the result of 'SHOW CREATE TABLE'

      Instead of this message, the error should describe how to perform schema evolution properly or, even better, state which step is missing.
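
      To illustrate the kind of message being requested, here is a minimal sketch in Java (the language of the Impala frontend). The class AvroEvolutionHint and the method missingColumnHint are hypothetical names invented for this example and are not part of Impala. The suggested statement simply mirrors the ADD COLUMNS step from the two-step sequence above, on the assumption that this is the step the user skipped.

```java
import java.util.Locale;

// Hypothetical helper, NOT part of Impala: sketches an error message that
// names the schema-evolution step that appears to be missing.
final class AvroEvolutionHint {
    private AvroEvolutionHint() {}

    // table, column and type are assumed to come from the existing analysis
    // that detects the mismatch between column definitions and the Avro schema.
    static String missingColumnHint(String table, String column, String type) {
        // Suggested fix: the ADD COLUMNS step from the description,
        // filled in with the column the Avro schema reports as missing.
        String fix = String.format(
            "ALTER TABLE %s ADD COLUMNS(%s %s);",
            table, column, type.toLowerCase(Locale.ROOT));

        // Keep the original diagnosis, then append the actionable hint.
        return String.format(
            "Cannot COMPUTE STATS on Avro table '%s' because its column definitions "
                + "do not match those in the Avro schema.%n"
                + "Missing column definition corresponding to Avro-schema column '%s' "
                + "of type '%s'.%n"
                + "If the Avro schema was evolved, add the missing column: %s",
            table, column, type, fix);
    }
}
```

      For the failing example above, AvroEvolutionHint.missingColumnHint("avro_table", "thefield3", "STRING") would yield a message that ends with the statement the user still needs to run, instead of advising them to re-create the table.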

          People

            Assignee: Unassigned
            Reporter: Balazs Jeszenszky