Description
Currently, SPARK-21127 (https://issues.apache.org/jira/browse/SPARK-21127) supports storing aggregated stats at the table level for partitioned tables when the config spark.sql.statistics.size.autoUpdate.enabled is set.
Supporting partition-level stats is useful for identifying outlier (skewed) partitions, and the query optimizer works better with partition-level stats when partition pruning applies.
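To illustrate the two benefits named above, here is a minimal sketch in plain Python rather than Spark code. The partition names, sizes, the `factor` threshold, and all three helper functions are hypothetical. The equal-size assumption used for the table-level estimate is a simplification for illustration, not a description of how Spark's optimizer actually estimates sizes.

```python
from statistics import median

# Hypothetical per-partition sizes in bytes. In Spark these would come from
# partition-level statistics in the catalog, not from a hard-coded dict.
partition_sizes = {
    "ds=2024-01-01": 120_000_000,
    "ds=2024-01-02": 110_000_000,
    "ds=2024-01-03": 2_400_000_000,  # deliberately skewed
    "ds=2024-01-04": 130_000_000,
}


def skewed_partitions(sizes, factor=5.0):
    """Return partitions whose size exceeds `factor` times the median size."""
    threshold = factor * median(sizes.values())
    return sorted(p for p, s in sizes.items() if s > threshold)


def estimate_with_table_stats(sizes, selected):
    """Only the table total is known: assume all partitions are equally sized."""
    total = sum(sizes.values())
    return total * len(selected) // len(sizes)


def estimate_with_partition_stats(sizes, selected):
    """Partition stats are known: sum the sizes of the partitions left after pruning."""
    return sum(sizes[p] for p in selected)


# A query filtered to a single day keeps one partition after pruning.
pruned = ["ds=2024-01-01"]
print(skewed_partitions(partition_sizes))
print(estimate_with_table_stats(partition_sizes, pruned))
print(estimate_with_partition_stats(partition_sizes, pruned))
```

In this made-up data, the table-level estimate for the pruned scan is several times larger than the real partition size, because the skewed partition inflates the average. That kind of gap is what could push an optimizer toward a worse plan when only table-level stats are available.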
Issue Links
- is related to
  - SPARK-21127 Update statistics after data changing commands (Resolved)
  - SPARK-33825 Is Spark SQL able to auto update partition stats like hive by setting hive.stats.autogather=true (Resolved)