Description
To reproduce this issue, run the following DDL:
CREATE TABLE foo STORED AS PARQUET AS SELECT CAST(1 AS TINYINT);
And then check the schema of the written Parquet file:
$ parquet-schema $WAREHOUSE_PATH/foo/000000_0
message hive_schema {
  optional int32 _c0;
}
When translating Hive types into Parquet types, TINYINT and SMALLINT should be translated into int32 (INT_8) and int32 (INT_16), respectively. However, HiveSchemaConverter converts TINYINT, SMALLINT, and INT alike into plain Parquet int32 with no annotation. This causes problems when other systems read Parquet files generated by Hive, because the original type information is lost.
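For illustration, the expected mapping can be sketched as a small lookup. This is not HiveSchemaConverter's code: the table name EXPECTED_PARQUET_TYPE and the helper parquet_field are hypothetical, and the output string only imitates the general layout printed by parquet-schema.

```python
# Hypothetical lookup: Hive integral type -> (Parquet physical type, annotation).
# An annotation of None means the physical type is written without one.
EXPECTED_PARQUET_TYPE = {
    "TINYINT": ("int32", "INT_8"),
    "SMALLINT": ("int32", "INT_16"),
    "INT": ("int32", None),
    "BIGINT": ("int64", None),
}


def parquet_field(hive_type, name, repetition="optional"):
    """Render the field declaration one would expect for a Hive column."""
    physical, annotation = EXPECTED_PARQUET_TYPE[hive_type.upper()]
    suffix = f" ({annotation})" if annotation else ""
    return f"{repetition} {physical} {name}{suffix};"


if __name__ == "__main__":
    for hive_type in ("TINYINT", "SMALLINT", "INT"):
        print(hive_type, "->", parquet_field(hive_type, "_c0"))
```

With this mapping, the TINYINT column from the reproduction above would be declared with an INT_8 annotation, rather than as the bare int32 that Hive currently writes.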
Attachments
Issue Links
- relates to SPARK-16632 Vectorized parquet reader fails to read certain fields from Hive tables (Resolved)