  1. Hive
  2. HIVE-14294

HiveSchemaConverter for Parquet doesn't translate TINYINT and SMALLINT into proper Parquet types

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.1, 2.1.0
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:

      Description

      To reproduce this issue, run the following DDL:

      CREATE TABLE foo STORED AS PARQUET AS SELECT CAST(1 AS TINYINT);
      

      And then check the schema of the written Parquet file:

      $ parquet-schema $WAREHOUSE_PATH/foo/000000_0
      message hive_schema {
        optional int32 _c0;
      }
      

      When translating Hive types into Parquet types, TINYINT and SMALLINT should be translated into int32 (INT_8) and int32 (INT_16), respectively. However, HiveSchemaConverter converts TINYINT, SMALLINT, and INT all into plain Parquet int32, with no logical type annotation. This causes problems when other systems read Parquet files generated by Hive, because the original type information is lost.
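      The expected mapping can be sketched as follows. This is only an illustration in Python, not the HiveSchemaConverter code or the attached patch; the names HIVE_TO_PARQUET, parquet_field, and parquet_schema are hypothetical and exist only for this example.

```python
# Illustrative sketch only: NOT the actual HiveSchemaConverter implementation.
# All names here (HIVE_TO_PARQUET, parquet_field, parquet_schema) are hypothetical.

# Hive integer type -> (Parquet physical type, logical type annotation or None)
HIVE_TO_PARQUET = {
    "tinyint": ("int32", "INT_8"),
    "smallint": ("int32", "INT_16"),
    "int": ("int32", None),  # plain int32 needs no annotation
}


def parquet_field(name, hive_type, repetition="optional"):
    """Render one field line in the style printed by parquet-schema."""
    physical, annotation = HIVE_TO_PARQUET[hive_type.lower()]
    suffix = f" ({annotation})" if annotation else ""
    return f"{repetition} {physical} {name}{suffix};"


def parquet_schema(columns, message_name="hive_schema"):
    """Render a whole message from a list of (column name, Hive type) pairs."""
    lines = [f"message {message_name} {{"]
    lines += [f"  {parquet_field(name, hive_type)}" for name, hive_type in columns]
    lines.append("}")
    return "\n".join(lines)


# The column produced by the DDL in this report.
print(parquet_schema([("_c0", "TINYINT")]))
```

      Under this mapping, the field for the reproduction above would carry the (INT_8) annotation instead of appearing as a bare int32, so a reader in another system could tell a TINYINT column apart from an INT column.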

        Attachments

        1. HIVE-14294.patch
          4 kB
          Gabor Szadovszky

              People

              • Assignee:
                Gabor Szadovszky (gszadovszky)
                Reporter:
                Cheng Lian (lian cheng)
              • Votes:
                0
                Watchers:
                5

                Dates

                • Created:
                  Updated:
                  Resolved: