[SPARK-26345] Parquet support Column indexes - ASF JIRA

Details

Type: Umbrella
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.0
Fix Version/s: 3.2.0
Component/s: SQL
Labels: None

Description

Parquet 1.11 supports column indexes. Spark can support this feature for better read performance.

More details:

https://issues.apache.org/jira/browse/PARQUET-1201

Benchmark result:

https://github.com/apache/spark/pull/31393#issuecomment-769767724

This feature is enabled by default, and users can disable it by setting parquet.filter.columnindex.enabled to false.
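
As a minimal sketch of how the setting above might be applied from PySpark: the helper names `column_index_conf` and `build_session` are hypothetical, and the use of the `spark.hadoop.` prefix to forward the key to the Hadoop configuration read by the Parquet reader is an assumption not stated in this issue. Only the key `parquet.filter.columnindex.enabled` comes from the description.

```python
# Key named in this issue's description.
COLUMN_INDEX_KEY = "parquet.filter.columnindex.enabled"


def column_index_conf(enabled: bool) -> dict:
    """Return Spark conf entries that toggle Parquet column-index filtering.

    Assumption: the "spark.hadoop." prefix forwards the key to the Hadoop
    configuration that the Parquet reader sees.
    """
    return {"spark.hadoop." + COLUMN_INDEX_KEY: str(enabled).lower()}


def build_session(enabled: bool):
    """Create a SparkSession with the toggle applied (requires pyspark)."""
    # Imported lazily so the helper above works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("column-index-toggle")
    for key, value in column_index_conf(enabled).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session(False)` would start a session with column-index filtering turned off, which can be useful when comparing read times with and without the feature. Setting the bare key at runtime, for example `spark.conf.set(COLUMN_INDEX_KEY, "false")`, may also work, but that should be checked against the Spark version in use.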

Issue Links

relates to: SPARK-34859 Vectorized parquet reader needs synchronization among pages for column index (Resolved)

Sub-Tasks

There are no Sub-Tasks for this issue.

People

Assignee: Yuming Wang

Reporter: Yuming Wang

Votes: 5

Watchers: 19

Dates

Created: 12/Dec/18 09:08

Updated: 13/Jan/23 07:06

Resolved: 31/Jan/21 02:06