  1. Spark
  2. SPARK-34084

ALTER TABLE .. ADD PARTITION does not update table stats

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.2, 3.1.1, 3.2.0
    • Fix Version/s: 3.0.2, 3.1.1, 3.2.0
    • Component/s: SQL
    • Labels:
      None

      Description

      The example below illustrates the issue:

      spark-sql> create table tbl (col0 int, part int) partitioned by (part);
      spark-sql> insert into tbl partition (part = 0) select 0;
      spark-sql> set spark.sql.statistics.size.autoUpdate.enabled=true;
      spark-sql> alter table tbl add partition (part = 1);
      

      After the partition is added, the table has no stats (there is no Statistics row):

      spark-sql> describe table extended tbl;
      col0	int	NULL
      part	int	NULL
      # Partition Information
      # col_name	data_type	comment
      part	int	NULL
      
      # Detailed Table Information
      Database	default
      Table	tbl
      Owner	maximgekk
      Created Time	Tue Jan 12 12:00:03 MSK 2021
      Last Access	UNKNOWN
      Created By	Spark 3.2.0-SNAPSHOT
      Type	MANAGED
      Provider	hive
      Table Properties	[transient_lastDdlTime=1610442003]
      Location	file:/Users/maximgekk/proj/fix-stats-in-add-partition/spark-warehouse/tbl
      Serde Library	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      InputFormat	org.apache.hadoop.mapred.TextInputFormat
      OutputFormat	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
      Storage Properties	[serialization.format=1]
      Partition Provider	Catalog
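
      When reproducing this, the missing row can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the DESCRIBE TABLE EXTENDED output has been captured as tab-separated text like the listing above; the helper name find_statistics is hypothetical and is not part of Spark.

```python
from typing import Optional


def find_statistics(describe_output: str) -> Optional[str]:
    """Return the value of the 'Statistics' row, or None if the row is absent.

    Hypothetical helper: assumes each row looks like "name<TAB>value[<TAB>comment]",
    as in the spark-sql output shown in this report.
    """
    for line in describe_output.splitlines():
        fields = line.strip().split("\t")
        if fields[0] == "Statistics" and len(fields) > 1:
            return fields[1]
    return None


# Output fragments shaped like the ones in this report.
after_add_partition = "Table\ttbl\nPartition Provider\tCatalog"
after_drop_partition = "Table\ttbl\nStatistics\t2 bytes"

print(find_statistics(after_add_partition))   # None
print(find_statistics(after_drop_partition))  # 2 bytes
```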
      

      As we can see, there are no stats. In contrast, ALTER TABLE .. DROP PARTITION does update stats:

      spark-sql> alter table tbl drop partition (part = 1);
      spark-sql> describe table extended tbl;
      col0	int	NULL
      part	int	NULL
      # Partition Information
      # col_name	data_type	comment
      part	int	NULL
      
      # Detailed Table Information
      ...
      Statistics	2 bytes
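
      The 2 bytes reported here are presumably the single text row "0" plus a newline stored under part=0. One way to cross-check a reported size is to sum the data file sizes under the table location. The sketch below only illustrates that idea and is not Spark's implementation; the function name total_size_bytes and the rule of skipping names that start with "." or "_" are assumptions made for this example.

```python
import os


def total_size_bytes(table_dir: str) -> int:
    """Sum the sizes of data files under table_dir, across all partition directories.

    Assumption for this sketch: files and directories whose names start with
    "." or "_" (checksums, markers, staging directories) are not data and are skipped.
    """
    total = 0
    for root, dirs, files in os.walk(table_dir):
        # Prune hidden/temporary directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if not d.startswith((".", "_"))]
        for name in files:
            if name.startswith((".", "_")):
                continue
            total += os.path.getsize(os.path.join(root, name))
    return total
```

      For the table in this report, the function would be called on the Location path shown by DESCRIBE TABLE EXTENDED; an empty partition directory such as part=1 contributes nothing to the total.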
      

            People

            • Assignee:
              maxgekk Max Gekk
              Reporter:
              maxgekk Max Gekk