Description
_metadata.columns are taken into account in FileSourceScanExec.supportColumnar, but not when the Parquet reader is created. As a result, the Parquet reader can produce columnar output (because, without the metadata columns, its schema has fewer fields than the WSCG.isTooManyFields limit), whereas FileSourceScanExec expects row output (because, with the extra metadata columns, the schema exceeds the isTooManyFields limit).
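To illustrate the mismatch, here is a toy model in plain Python. It is not Spark code: the function names, the flat field counting, and the limit of 100 are assumptions made for the sketch, and the real isTooManyFields check works on the actual schema rather than on simple counts.

```python
# Toy model of the two decisions described in this issue. Not Spark code.
# All names and the limit value below are hypothetical stand-ins.

MAX_FIELDS = 100  # assumed limit standing in for the threshold behind isTooManyFields


def is_too_many_fields(num_fields: int, max_fields: int = MAX_FIELDS) -> bool:
    """Stand-in for WSCG.isTooManyFields: True when the field count exceeds the limit."""
    return num_fields > max_fields


def scan_expects_columnar(num_data_fields: int, num_metadata_fields: int) -> bool:
    """Scan-side decision: counts data fields plus the extra metadata fields."""
    return not is_too_many_fields(num_data_fields + num_metadata_fields)


def reader_outputs_columnar(num_data_fields: int) -> bool:
    """Reader-side decision as described in the issue: metadata fields are ignored."""
    return not is_too_many_fields(num_data_fields)


def decisions_agree(num_data_fields: int, num_metadata_fields: int) -> bool:
    """True when the scan and the reader pick the same output format."""
    return scan_expects_columnar(
        num_data_fields, num_metadata_fields
    ) == reader_outputs_columnar(num_data_fields)


if __name__ == "__main__":
    # 98 data fields alone stay under the limit, so the reader goes columnar;
    # 98 + 5 metadata fields exceed it, so the scan expects rows.
    print(reader_outputs_columnar(98))   # True
    print(scan_expects_columnar(98, 5))  # False
    print(decisions_agree(98, 5))        # False
```

In this model the two sides disagree only when the data fields fit under the limit on their own but the metadata fields push the total over it, which matches the narrow window in which the bug shows up.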
I have a fix forthcoming.
Issue Links
- is related to
  - ORC-1578 Fix SparkBenchmark according to SPARK-40918 (Closed)