Description
Parquet 1.11 supports column indexing. Spark can support this feature to improve read performance.
More details:
https://issues.apache.org/jira/browse/PARQUET-1201
Benchmark result:
https://github.com/apache/spark/pull/31393#issuecomment-769767724
This feature is enabled by default, and users can disable it by setting parquet.filter.columnindex.enabled to false.
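
As an illustration of disabling it, here is a minimal sketch for a PySpark application. It assumes the property is passed through Spark's `spark.hadoop.` prefix, which copies the setting into the Hadoop configuration used by the Parquet reader. The helper names `column_index_conf` and `build_session` are hypothetical and not part of Spark.

```python
# Parquet property named in this issue.
COLUMN_INDEX_KEY = "parquet.filter.columnindex.enabled"


def column_index_conf(enabled: bool) -> dict:
    """Return Spark conf entries that toggle Parquet column index filtering.

    Assumption: the `spark.hadoop.` prefix is used so that Spark forwards
    the property to the Hadoop Configuration.
    """
    return {f"spark.hadoop.{COLUMN_INDEX_KEY}": str(enabled).lower()}


def build_session(enabled: bool):
    """Build a SparkSession with column index filtering turned on or off."""
    # Imported lazily so column_index_conf works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("column-index-toggle")
    for key, value in column_index_conf(enabled).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session(False)` before any other session exists should yield a session whose Parquet reads skip column index filtering. If a session is already running, `getOrCreate()` returns it and the Hadoop setting may not take effect.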
Issue Links
- relates to SPARK-34859 Vectorized parquet reader needs synchronization among pages for column index (Resolved)