IMPALA-9738

incompatible Parquet schema for column "ex: x is of type String" Column type: STRING, Parquet schema:

    Details

    • Type: Question
    • Status: Resolved
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: Impala 2.12.0
    • Fix Version/s: Impala 2.12.0
    • Component/s: Clients
    • Labels:
      None
    • Environment:
      Test
    • Flags:
      Important

      Description

      I have an existing external table, for example A, with n columns. It is loaded daily and is partitioned by extract_date.

      We got a request from the business to add a few more columns to the existing table. To implement this, we ran the following statements:

      alter table xxxx.yyyyyy add columns (`c10` string COMMENT '',`b` string COMMENT '',`c11` string COMMENT '',`c12` string COMMENT '',`c13` string COMMENT '',`c14` string COMMENT '',`c15` string COMMENT '') ;
      alter table xxxx.yyyyyyy change `c8` `c8` string COMMENT '' after `c7` ;
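
      Note that these ALTER TABLE statements change only the table metadata; Parquet files already written to the partitions keep their original column layout. A minimal sketch for comparing the two, assuming a hypothetical scratch table name (schema_probe) and a hypothetical HDFS path to one of the older data files, neither of which appears in the original report:

      ```sql
      -- Column order as the metastore now reports it (run in Impala or Hive).
      DESCRIBE xxxx.yyyyyyy;

      -- Impala only: derive a scratch table definition from one existing Parquet
      -- file, then list its columns in file order.
      -- The table name and the path below are hypothetical placeholders.
      CREATE EXTERNAL TABLE xxxx.schema_probe
        LIKE PARQUET '/hypothetical/path/extract_date=2019-01-01/part-00000.parq'
        STORED AS PARQUET;
      DESCRIBE xxxx.schema_probe;

      -- Dropping an external table removes only its metadata, not the data file.
      DROP TABLE xxxx.schema_probe;
      ```

      If the two DESCRIBE outputs list the columns in a different order, the table definition and the older files no longer line up by position.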

      After these two steps, I went to Hive and ran MSCK REPAIR TABLE xxxx.yyyyyy;

      Partitions were added (there are partitions going back to 2018).

       

      Before the change described above, I was able to query the data from both Impala and Hive. After executing the ALTER commands, I get the error shown below.

       

      select * from xxxx.yyyyyyy where extract_date like '2019%';
      Query: select * from XXXXX.YYYYYYY where extract_date like '2019%'
      Query submitted at: 2020-05-09 11:57:10 (Coordinator: ' xxxx.yyyyyyy .c9'. Column type: STRING, Parquet schema:
      optional fixed_len_byte_array a_auth [i:12 d:1 r:0]

       

      In Hive, the same query returns the data with no issues. The error occurs only in Impala.
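
      A possible explanation, offered as an assumption rather than a confirmed diagnosis: by default Impala matches table columns to Parquet file columns by position, while Hive matches them by name, so reordering columns in the metastore can pair a STRING table column with a differently typed file column. The PARQUET_FALLBACK_SCHEMA_RESOLUTION query option switches Impala to name-based matching for the session. A minimal sketch:

      ```sql
      -- Run in impala-shell. The option applies to the current session only.
      SET PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;

      -- Re-run the failing query from the report.
      select * from xxxx.yyyyyyy where extract_date like '2019%';
      ```

      Name-based matching only helps when the column names in the table definition match the names stored in the Parquet files.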

       

      Troubleshooting steps:

      1. Created a new table without the additional columns, pointed its external location at a new path, and copied the previously created partitions to the new path.

      MSCK REPAIR TABLE TABLE NAME;

      The select query works in both Impala and Hive.

       

      2. Added the additional fields to the newly created table with ALTER commands, then ran the following:

      MSCK REPAIR TABLE TABLE NAME;

      In Impala: REFRESH TABLE TABLE NAME;

      INVALIDATE METADATA TABLE NAME;

      This time the select query worked in Hive, but Impala returned the error shown above.
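
      For reference, a sketch of the Impala-side metadata reload written in Impala's statement syntax, where REFRESH takes the table name directly with no TABLE keyword. The table name is the placeholder used elsewhere in this report:

      ```sql
      -- Reload file and block metadata for a table Impala already knows about.
      REFRESH xxxx.yyyyyyy;

      -- Heavier alternative: discard and reload all cached metadata for the table.
      INVALIDATE METADATA xxxx.yyyyyyy;

      -- Impala counterpart of Hive's MSCK REPAIR TABLE for partition directories.
      ALTER TABLE xxxx.yyyyyyy RECOVER PARTITIONS;
      ```

      Reloading metadata makes Impala see the current table definition and files, but it does not change how table columns are matched to the columns inside existing Parquet files.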

      Can someone explain why this is happening and how to fix it?

       

      Impala Shell v2.12.0-cdh5.16.2

            People

            • Assignee:
              Unassigned
            • Reporter:
              RKNAIDU RK