Description
SPARK-36879 implements the DELTA_BINARY_PACKED encoding, which is used for integer values, but does not implement the DELTA_BYTE_ARRAY encoding, which is used for string values. DELTA_BYTE_ARRAY in turn requires the DELTA_LENGTH_BYTE_ARRAY encoding, which it uses to store the suffixes. Both encodings need vectorized implementations, because the current implementation simply calls the non-vectorized Parquet library methods.
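To illustrate why DELTA_BYTE_ARRAY depends on DELTA_LENGTH_BYTE_ARRAY, here is a minimal sketch of the decoding logic. It is not the Spark or Parquet library code. It assumes the prefix lengths and suffix lengths have already been decoded from their DELTA_BINARY_PACKED blocks, and the function names `split_by_lengths` and `decode_delta_byte_array` are hypothetical.

```python
from typing import List


def split_by_lengths(lengths: List[int], data: bytes) -> List[bytes]:
    """DELTA_LENGTH_BYTE_ARRAY step: slice concatenated bytes by decoded lengths."""
    values = []
    offset = 0
    for n in lengths:
        values.append(data[offset:offset + n])
        offset += n
    return values


def decode_delta_byte_array(
    prefix_lengths: List[int],
    suffix_lengths: List[int],
    suffix_data: bytes,
) -> List[bytes]:
    """DELTA_BYTE_ARRAY step: rebuild each value from the previous value's prefix.

    Assumption: both length lists are already plain integers, i.e. the
    DELTA_BINARY_PACKED decoding has happened before this call.
    """
    suffixes = split_by_lengths(suffix_lengths, suffix_data)
    values = []
    previous = b""
    for prefix_len, suffix in zip(prefix_lengths, suffixes):
        # Each value shares its first prefix_len bytes with the previous value.
        current = previous[:prefix_len] + suffix
        values.append(current)
        previous = current
    return values


decoded = decode_delta_byte_array([0, 5, 4], [5, 1, 1], b"applety")
print(decoded)
```

Because each value is rebuilt from the one before it, a vectorized reader still has to process the values of a page in order, but it can write the results straight into a column vector instead of going through per-value library calls.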
Issue Links
- incorporates: SPARK-38169 Use OffHeap memory if configured in vectorized DeltaByteArray reader (Open)
- is part of: SPARK-36879 Support Parquet v2 data page encodings for the vectorized path (Resolved)
- links to