Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Invalid
- Affects Version/s: 2.4.0
- Fix Version/s: None
- Component/s: None
Description
I think we should make the Parquet block size configurable when writing in Parquet format.
For HDFS, `dfs.block.size` is configurable, and we sometimes want the Parquet block size to be consistent with it.
Also, when reading Parquet files, is it best to keep `spark.sql.files.maxPartitionBytes` consistent with the Parquet block size? A sketch of how this can be done today is below.
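For illustration, here is a minimal sketch of aligning these sizes with existing configuration keys: `parquet.block.size` (read by the Parquet writer from the Hadoop configuration), `dfs.block.size` (HDFS block size), and `spark.sql.files.maxPartitionBytes` (Spark's read-side split size). The 128 MB value and the output path are just example values.

```scala
import org.apache.spark.sql.SparkSession

object ParquetBlockSizeExample {
  def main(args: Array[String]): Unit = {
    val targetBlockSize = 128L * 1024 * 1024 // example: 128 MB

    val spark = SparkSession.builder()
      .appName("ParquetBlockSizeExample")
      // Read side: keep partition size consistent with the Parquet block size.
      .config("spark.sql.files.maxPartitionBytes", targetBlockSize)
      .getOrCreate()

    // Write side: the Parquet writer picks up parquet.block.size (row group size)
    // from the Hadoop configuration, so set it to match the HDFS block size.
    spark.sparkContext.hadoopConfiguration.setLong("parquet.block.size", targetBlockSize)
    spark.sparkContext.hadoopConfiguration.setLong("dfs.block.size", targetBlockSize)

    // Example write; each row group should now fit within one HDFS block.
    val df = spark.range(0L, 10000000L)
    df.write.mode("overwrite").parquet("/tmp/parquet-block-size-demo")

    spark.stop()
  }
}
```

With this setup one row group maps to one HDFS block, so a read task scheduled on a single split does not have to fetch data from a second block.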