SPARK-42388

Avoid unnecessary parquet footer reads when no filters in vectorized reader

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: SQL
    • Labels: None

    Description

      The vectorized Parquet reader currently reads the Parquet footer twice, even when there are no filters that require pushdown.
      When the NameNode is under heavy load, the second read costs extra time. This unnecessary footer read can be avoided by reusing the footer metadata in VectorizedParquetRecordReader.
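
      The idea can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's code: `Footer`, `CountingFileSystem`, `ToyVectorizedReader`, and `open_file` are hypothetical names invented for illustration, and the model assumes that the caller has already read the footer once before it constructs the reader.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Footer:
    """Stand-in for Parquet footer metadata (hypothetical, not a Spark class)."""
    num_row_groups: int


class CountingFileSystem:
    """Toy file system that counts how many times a footer is read."""

    def __init__(self) -> None:
        self.footer_reads = 0

    def read_footer(self, path: str) -> Footer:
        # On a real cluster each footer read means extra file system calls.
        self.footer_reads += 1
        return Footer(num_row_groups=3)


class ToyVectorizedReader:
    """Toy reader that re-reads the footer unless one is handed to it."""

    def __init__(self, fs: CountingFileSystem, path: str,
                 footer: Optional[Footer] = None) -> None:
        # Reuse the caller's footer when available; otherwise read it again.
        self.footer = footer if footer is not None else fs.read_footer(path)


def open_file(fs: CountingFileSystem, path: str,
              reuse_footer: bool) -> ToyVectorizedReader:
    # Assumption: the caller reads the footer once for its own purposes.
    footer = fs.read_footer(path)
    # Passing the footer along avoids the second read inside the reader.
    return ToyVectorizedReader(fs, path, footer if reuse_footer else None)
```

      In this model, calling `open_file` with `reuse_footer=False` reads the footer twice, while `reuse_footer=True` reads it once and hands the same metadata to the reader.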

          People

            miracle Mars