Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
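BinaryStatistics currently tracks only a min and a max, and computes them by comparing values as signed byte[]. Under signed comparison, any byte at or above 0x80 (which includes every byte of a multi-byte UTF-8 sequence) is negative and sorts before all ASCII bytes. For a true UTF-8 lexicographic comparison, e.g. for string columns, BinaryStatistics should instead calculate min and max with a comparator that treats the bytes as unsigned.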
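The difference between the two orderings can be illustrated with a small standalone sketch. The class `UnsignedBinaryOrder` and its methods are hypothetical names chosen for illustration; they are not part of parquet-mr and do not reflect how the fix was actually implemented.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not parquet-mr code.
public class UnsignedBinaryOrder {

    // Lexicographic comparison using Java's signed byte values (-128..127).
    public static int compareSigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Byte.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        // Equal on the common prefix: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] z = "z".getBytes(StandardCharsets.UTF_8);      // 0x7A
        byte[] e = "\u00e9".getBytes(StandardCharsets.UTF_8); // 0xC3 0xA9

        // Signed: 0xC3 is -61, so "é" sorts before "z" (wrong for UTF-8).
        System.out.println(compareSigned(z, e) > 0);   // true
        // Unsigned: 0x7A < 0xC3, so "z" sorts before "é", matching code point order.
        System.out.println(compareUnsigned(z, e) < 0); // true
    }
}
```

With the signed comparator, a column containing both "z" and "é" would record "é" as its min and "z" as its max, so a range filter relying on those statistics could skip pages it should read.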
Issue Links
- blocks
  - PARQUET-1134 Release Parquet format 2.4.0 (Resolved)
  - PARQUET-392 Release Parquet-mr 1.9.0 (Resolved)
- causes
  - PARQUET-1322 Statistics is not available for DECIMAL types (Open)
- is duplicated by
  - PARQUET-839 Min-max should be computed based on logical type (Resolved)
- is related to
  - SPARK-17213 Parquet String Pushdown for Non-Eq Comparisons Broken (Resolved)
- relates to
  - PARQUET-2016 Reference column_order field from column indexes (Resolved)
- links to