Spark / SPARK-38573

Support Auto Partition Statistics Collection

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, SPARK-21127 (https://issues.apache.org/jira/browse/SPARK-21127) supports storing aggregated statistics at the table level for partitioned tables when the config spark.sql.statistics.size.autoUpdate.enabled is set.

      Supporting partition-level statistics is useful for identifying outlier (skewed) partitions, and the query optimizer works better with partition-level statistics when partition pruning applies.
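
To illustrate why per-partition sizes matter for spotting skew, here is a minimal sketch in plain Python. It does not call Spark; the partition names, the byte counts, the helper `find_skewed_partitions`, and the "5x the median" threshold are all hypothetical choices made for illustration, not part of this issue or of Spark's API.

```python
from statistics import median


def find_skewed_partitions(partition_sizes, factor=5.0):
    """Return (sorted) partitions whose size exceeds `factor` times the median.

    `partition_sizes` maps a partition spec to its size in bytes.
    Both the helper and the default factor are illustrative assumptions.
    """
    if not partition_sizes:
        return []
    threshold = factor * median(partition_sizes.values())
    return sorted(p for p, size in partition_sizes.items() if size > threshold)


# Hypothetical per-partition sizes in bytes (made-up numbers).
sizes = {
    "ds=2022-03-01": 120_000_000,
    "ds=2022-03-02": 110_000_000,
    "ds=2022-03-03": 2_400_000_000,
    "ds=2022-03-04": 130_000_000,
}

skewed = find_skewed_partitions(sizes)
table_total = sum(sizes.values())  # the only figure a table-level stat exposes
```

With only the aggregated table-level total, the oversized `ds=2022-03-03` partition would be indistinguishable from four evenly sized partitions; the per-partition figures are what make the outlier visible.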

       

            People

              Assignee: Kazuyuki Tanimura (kazuyukitanimura)
              Reporter: Kazuyuki Tanimura (kazuyukitanimura)
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated:
                Resolved: