  1. Spark
  2. SPARK-26345

Parquet support Column indexes

Details

    • Type: Umbrella
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      Parquet 1.11 supports column indexing. Spark can support this feature for better read performance.

      More details:

      https://issues.apache.org/jira/browse/PARQUET-1201

       

      Benchmark result:

      https://github.com/apache/spark/pull/31393#issuecomment-769767724

      This feature is enabled by default, and users can disable it by setting parquet.filter.columnindex.enabled to false.
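
      A minimal sketch of one way to turn the setting off from PySpark. It assumes the key is passed through Spark's `spark.hadoop.*` prefix, which Spark copies into the Hadoop Configuration with the prefix removed; the helper names `column_index_conf` and `build_session` and the application name are hypothetical and not part of Spark or of this issue.

      ```python
      # Key name taken from the issue description above.
      PARQUET_COLUMN_INDEX_KEY = "parquet.filter.columnindex.enabled"


      def column_index_conf(enabled: bool) -> dict:
          """Return Spark properties that switch Parquet column-index filtering on or off.

          Assumption: the key is forwarded via the `spark.hadoop.` prefix, which
          Spark strips when it populates the Hadoop Configuration.
          """
          return {"spark.hadoop." + PARQUET_COLUMN_INDEX_KEY: str(enabled).lower()}


      def build_session(enabled: bool):
          """Hypothetical helper: build a SparkSession with the setting applied."""
          # Imported lazily so the module can be loaded without pyspark installed.
          from pyspark.sql import SparkSession

          builder = SparkSession.builder.appName("column-index-demo")
          for key, value in column_index_conf(enabled).items():
              builder = builder.config(key, value)
          return builder.getOrCreate()


      if __name__ == "__main__":
          # Show the property that would be handed to Spark to disable the feature.
          print(column_index_conf(False))
      ```

      Calling `build_session(False)` would then start a session with column-index filtering disabled. The same property could presumably be supplied on the command line with `--conf`, or in `spark-defaults.conf`, instead of in code.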

            People

              Assignee: Yuming Wang (yumwang)
              Reporter: Yuming Wang (yumwang)
              Votes: 5
              Watchers: 19