Hive / HIVE-14294

HiveSchemaConverter for Parquet doesn't translate TINYINT and SMALLINT into proper Parquet types

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.1, 2.1.0
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels: None

    Description

      To reproduce this issue, run the following DDL:

      CREATE TABLE foo STORED AS PARQUET AS SELECT CAST(1 AS TINYINT);
      

      And then check the schema of the written Parquet file:

      $ parquet-schema $WAREHOUSE_PATH/foo/000000_0
      message hive_schema {
        optional int32 _c0;
      }
      

      When translating Hive types into Parquet types, TINYINT and SMALLINT should be translated into int32 (INT_8) and int32 (INT_16), respectively. However, HiveSchemaConverter converts TINYINT, SMALLINT, and INT all into plain Parquet int32. This causes problems when other systems read Parquet files generated by Hive, because the original type information is lost.
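      The expected mapping can be sketched as follows. This is a minimal illustration in Python, not the actual HiveSchemaConverter code: the names HIVE_TO_PARQUET, parquet_field, and parquet_schema are hypothetical, and the rendered layout only approximates what parquet-schema prints.

```python
# Illustrative sketch only: HIVE_TO_PARQUET, parquet_field and parquet_schema
# are hypothetical names, not part of Hive or Parquet.

# Hive integer type -> (Parquet physical type, annotation or None)
HIVE_TO_PARQUET = {
    "tinyint": ("int32", "INT_8"),
    "smallint": ("int32", "INT_16"),
    "int": ("int32", None),
    "bigint": ("int64", None),
}


def parquet_field(hive_type, name, repetition="optional"):
    """Render one Parquet schema field for a Hive integer column."""
    physical, annotation = HIVE_TO_PARQUET[hive_type.lower()]
    suffix = f" ({annotation})" if annotation else ""
    return f"{repetition} {physical} {name}{suffix};"


def parquet_schema(columns, message_name="hive_schema"):
    """Render a schema for a list of (column name, Hive type) pairs."""
    lines = [f"message {message_name} {{"]
    lines += [f"  {parquet_field(hive_type, name)}" for name, hive_type in columns]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Schema one would expect for the table created by the DDL above.
    print(parquet_schema([("_c0", "TINYINT")]))
```

      Under this mapping, the TINYINT column from the reproduction would carry an INT_8 annotation instead of appearing as a bare int32, so readers in other systems could recover the narrower type.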

      Attachments

        1. HIVE-14294.patch
          4 kB
          Gabor Szadovszky

            People

              Assignee: Gabor Szadovszky (gszadovszky)
              Reporter: Cheng Lian (lian cheng)
              Votes: 0
              Watchers: 5
