Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
The current Parquet serialisation of Thrift definitions converts Thrift byte fields to INT32 without writing the required LogicalType metadata to the Parquet file. The 8-bit width of the field is therefore lost, which is inconsistent with the expected behaviour and with the handling of i16 fields, which do receive the annotation. The correct conversion is INT32 with LogicalType metadata indicating a bit width of 8 and signed set to true.
Thrift Definition
struct TestLogicalType {
1: required i16 test_i16,
2: required byte test_i8
}
Current Parquet Schema Generated
message ParquetSchema {
required int32 test_i16 (INTEGER(16,true)) = 1;
required int32 test_i8 = 2;
}
Expected Parquet Schema
message ParquetSchema {
required int32 test_i16 (INTEGER(16,true)) = 1;
required int32 test_i8 (INTEGER(8,true)) = 2;
}
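The expected mapping can be illustrated with a small, self-contained Java sketch. It is not the parquet-thrift converter and does not use any Parquet classes. The class ThriftIntSchemaSketch, the enum ThriftInt and the method requiredField are hypothetical names introduced only for illustration. The sketch assumes that both byte and i16 are stored as int32 and that each carries an INTEGER annotation with its own bit width and signed set to true, as described above.

```java
// Hypothetical model of the mapping described in this issue.
// It does not use or reproduce the parquet-thrift converter classes.
final class ThriftIntSchemaSketch {

    // The two Thrift integer types from the example struct, with their bit widths.
    // Both are signed in Thrift and both are stored as Parquet int32.
    enum ThriftInt {
        BYTE(8),
        I16(16);

        final int bitWidth;

        ThriftInt(int bitWidth) {
            this.bitWidth = bitWidth;
        }
    }

    // Render one required field in the schema notation used in this report,
    // always attaching the INTEGER(bitWidth,true) annotation.
    static String requiredField(ThriftInt type, String name, int fieldId) {
        return "required int32 " + name
                + " (INTEGER(" + type.bitWidth + ",true)) = " + fieldId + ";";
    }

    public static void main(String[] args) {
        System.out.println("message ParquetSchema {");
        System.out.println(requiredField(ThriftInt.I16, "test_i16", 1));
        System.out.println(requiredField(ThriftInt.BYTE, "test_i8", 2));
        System.out.println("}");
    }
}
```

The bug reported here corresponds to the annotation being omitted for the BYTE case only. In the sketch both types share one code path, so the annotation cannot be dropped for one of them.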