Remove project if output is subset of child

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      We can remove some redundant `Project` nodes after column pruning has completed.

      e.g.,

      create table t1(c1 int, c2 int) using parquet;
      
      explain extended
      select sum(c1) from (
        select * from t1
      );
      

      Currently we get this plan:

      == Physical Plan ==
      *(2) HashAggregate(keys=[], functions=[sum(cast(c1#19 as bigint))], output=[sum(c1)#68L])
      +- Exchange SinglePartition, true, [id=#86]
         +- *(1) HashAggregate(keys=[], functions=[partial_sum(cast(c1#19 as bigint))], output=[sum#70L])
            +- *(1) Project [c1#19]
               +- *(1) ColumnarToRow
                  +- FileScan parquet default.t1[c1#19] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[hdfs:///user/hive/warehouse/t1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<c1:int>
      

      We can remove the `Project`, like this:

      == Physical Plan ==
      *(2) HashAggregate(keys=[], functions=[sum(cast(c1#19 as bigint))], output=[sum(c1)#68L])
      +- Exchange SinglePartition, true, [id=#86]
         +- *(1) HashAggregate(keys=[], functions=[partial_sum(cast(c1#19 as bigint))], output=[sum#70L])
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[c1#19] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[hdfs:///user/hive/warehouse/t1], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<c1:int>
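
The rewrite above could be sketched on a toy plan model as follows. This is only an illustration in Python, not Spark's optimizer code: the `Scan`, `Project`, and `Aggregate` classes and the `remove_redundant_project` function are hypothetical names. The sketch assumes the parent is an aggregate that reads its input columns by name, so extra columns in the child do not change its result.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Scan:
    """Leaf node: the columns read from the table (hypothetical model)."""
    output: Tuple[str, ...]


@dataclass(frozen=True)
class Project:
    """Selects plain columns from its child (no computed expressions)."""
    columns: Tuple[str, ...]
    child: "Plan"

    @property
    def output(self) -> Tuple[str, ...]:
        return self.columns


@dataclass(frozen=True)
class Aggregate:
    """Aggregate that only reads the columns listed in `references`."""
    references: Tuple[str, ...]
    child: "Plan"

    @property
    def output(self) -> Tuple[str, ...]:
        return ("agg",)


Plan = Union[Scan, Project, Aggregate]


def remove_redundant_project(plan: Plan) -> Plan:
    """Drop a Project under an Aggregate when the Project's output
    is a subset of its own child's output (assumed safe condition)."""
    if isinstance(plan, Scan):
        return plan
    child = remove_redundant_project(plan.child)
    if isinstance(plan, Aggregate):
        if isinstance(child, Project) and set(child.output) <= set(child.child.output):
            child = child.child  # skip the Project, keep its input
        return Aggregate(plan.references, child)
    # A Project that is not directly under an Aggregate is left alone here.
    return Project(plan.columns, child)


# Mirrors the example: Aggregate(sum(c1)) -> Project[c1] -> Scan[c1]
before = Aggregate(("c1",), Project(("c1",), Scan(("c1",))))
after = remove_redundant_project(before)
```

Here `after` is the aggregate sitting directly on the scan, matching the shape of the second plan. The sketch deliberately leaves a top-level `Project` in place, since removing it would change the query's output columns.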
      

          People

            Assignee: Unassigned
            Reporter: XiDuo You (ulysses)