Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 3.4.0
Description
The vectorized Parquet reader currently reads the Parquet footer twice, even when there are no filters that require pushdown.
When the NameNode is under heavy load, the second read adds noticeable latency. We can avoid this unnecessary footer read by reusing the already-read footer metadata in VectorizedParquetRecordReader.
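
The idea can be illustrated with a minimal, self-contained sketch. It is a model of the pattern, not Spark's actual code: `Footer`, `CountingStore`, `RecordReader`, and `open_reader` are hypothetical names, and the store only counts footer reads in place of real NameNode round trips. The filter branch is deliberately simplified and assumes the reader re-reads the footer for itself when filters are pushed down.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Footer:
    """Stand-in for parsed Parquet footer metadata (hypothetical)."""
    schema: tuple
    row_groups: tuple


class CountingStore:
    """Hypothetical file store that counts how often a footer is read."""

    def __init__(self) -> None:
        self.footer_reads = 0

    def read_footer(self, path: str) -> Footer:
        # Each call models one extra round trip to the file system.
        self.footer_reads += 1
        return Footer(schema=("id", "name"), row_groups=(0, 1))


class RecordReader:
    """Hypothetical reader that can accept an already-parsed footer."""

    def __init__(self, store: CountingStore, path: str,
                 footer: Optional[Footer] = None) -> None:
        # Reuse the caller's footer when supplied; otherwise read it here.
        self.footer = footer if footer is not None else store.read_footer(path)


def open_reader(store: CountingStore, path: str,
                pushed_filters: Sequence[str] = ()) -> RecordReader:
    # The caller needs the footer anyway (for example, to inspect the schema).
    footer = store.read_footer(path)
    if pushed_filters:
        # Simplified assumption: with pushed-down filters the reader
        # reads the footer again on its own, as before.
        return RecordReader(store, path)
    # No filters: hand the footer over so it is read only once.
    return RecordReader(store, path, footer=footer)
```

With no filters, `open_reader(store, "part-0.parquet")` leaves `store.footer_reads` at 1. Passing a non-empty `pushed_filters` keeps the old two-read behaviour in this model.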
Issue Links
- causes: SPARK-48950 Corrupt data from parquet scans (Open)
- relates to: SPARK-48571 Reduce the number of accesses to S3 object storage (Open)
- links to