Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Description
The parquet-format spec states that DELTA_BYTE_ARRAY is only supported for the physical type BYTE_ARRAY. Yet parquet-mr also uses it to encode FIXED_LEN_BYTE_ARRAY.
So, I guess either the spec should be updated to list FIXED_LEN_BYTE_ARRAY among the types supported by the DELTA_BYTE_ARRAY encoding, or the code should be changed so that it no longer writes this encoding for FIXED_LEN_BYTE_ARRAY.
I guess changing the spec is more prudent, given that
a) the encoding can make sense for FIXED_LEN_BYTE_ARRAY
and
b) there might already be countless files written with this encoding / type combination.
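
To illustrate point a), here is a minimal Python sketch of the shared-prefix idea behind DELTA_BYTE_ARRAY, applied to fixed-width values. It is not parquet-mr's code and does not produce the actual on-disk layout (the real encoding additionally delta-encodes the prefix and suffix lengths). The function names `prefix_split` and `prefix_join` and the sample 8-byte keys are hypothetical, chosen only for illustration.

```python
def prefix_split(values):
    """Split each value into (shared-prefix length, remaining suffix).

    Simplified sketch: the lengths are returned as plain integers and
    are not delta-encoded or bit-packed.
    """
    prefix_lengths = []
    suffixes = []
    previous = b""
    for value in values:
        # Count the leading bytes shared with the previous value.
        limit = min(len(previous), len(value))
        n = 0
        while n < limit and previous[n] == value[n]:
            n += 1
        prefix_lengths.append(n)
        suffixes.append(value[n:])
        previous = value
    return prefix_lengths, suffixes


def prefix_join(prefix_lengths, suffixes):
    """Rebuild the original values from prefix lengths and suffixes."""
    values = []
    previous = b""
    for n, suffix in zip(prefix_lengths, suffixes):
        value = previous[:n] + suffix
        values.append(value)
        previous = value
    return values


# Hypothetical fixed-width (8-byte) keys that share long prefixes.
fixed_values = [
    bytes.fromhex("00000000000003e8"),
    bytes.fromhex("00000000000003e9"),
    bytes.fromhex("00000000000007d0"),
]

prefix_lengths, suffixes = prefix_split(fixed_values)
restored = prefix_join(prefix_lengths, suffixes)
```

In this sample, only the first value is stored in full; the later ones keep just the bytes that differ from their predecessor. Nothing in the scheme depends on the values having variable length, which is why it can pay off for FIXED_LEN_BYTE_ARRAY columns whose values share leading bytes.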