  Spark / SPARK-33400

Normalize sameOrderExpressions in SortOrder

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.7, 3.0.0, 3.0.1
    • Fix Version/s: 3.1.0
    • Component/s: SQL
    • Labels: None

    Description

      When a SortMergeJoin is followed by a Project with aliases, the outputOrdering is not propagated properly, and in some cases this leads to an unnecessary Sort operation:

       

       

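      // Run in spark-shell, where `spark`, `sql` and the $"..." column syntax are already in scope.
      // Assumption: broadcast joins are disabled (e.g. spark.sql.autoBroadcastJoinThreshold=-1);
      // otherwise tables this small would likely not be planned as SortMergeJoin.
      spark.range(10).repartition($"id").createTempView("t1")
      spark.range(20).repartition($"id").createTempView("t2")
      spark.range(30).repartition($"id").createTempView("t3")

      // Join t1 and t2 under aliased column names, then join the result with t3.
      val planned = sql(
        """
          |SELECT t2id, t3.id as t3id
          |FROM (
          |    SELECT t1.id as t1id, t2.id as t2id
          |    FROM t1, t2
          |    WHERE t1.id = t2.id
          |) t12, t3
          |WHERE t1id = t3.id
        """.stripMargin).queryExecution.executedPlan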
      
      
      
      *(8) Project [t2id#1059L, id#1004L AS t3id#1060L]
      +- *(8) SortMergeJoin [t2id#1059L], [id#1004L], Inner
         :- *(5) Sort [t2id#1059L ASC NULLS FIRST ], false, 0  <-----------------------
         :  +- *(5) Project [id#1000L AS t2id#1059L]
         :     +- *(5) SortMergeJoin [id#996L], [id#1000L], Inner
         :        :- *(2) Sort [id#996L ASC NULLS FIRST ], false, 0
         :        :  +- Exchange hashpartitioning(id#996L, 5), true, [id=#1426]
         :        :     +- *(1) Range (0, 10, step=1, splits=2)
         :        +- *(4) Sort [id#1000L ASC NULLS FIRST ], false, 0
         :           +- Exchange hashpartitioning(id#1000L, 5), true, [id=#1432]
         :              +- *(3) Range (0, 20, step=1, splits=2)
         +- *(7) Sort [id#1004L ASC NULLS FIRST ], false, 0
            +- Exchange hashpartitioning(id#1004L, 5), true, [id=#1443]
               +- *(6) Range (0, 30, step=1, splits=2)
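
      The number of Sort operators in a plan can also be checked programmatically rather than by reading the tree. The sketch below is one way to do that, assuming Spark is on the classpath and adaptive query execution is off (the default in the affected versions); `countSorts` is a hypothetical helper name, not part of Spark.

```scala
import org.apache.spark.sql.execution.{SortExec, SparkPlan}

// Hypothetical helper: counts the physical Sort operators (SortExec nodes)
// found anywhere in the given physical plan.
def countSorts(plan: SparkPlan): Int =
  plan.collect { case s: SortExec => s }.size

// In the same spark-shell session as the query above:
// countSorts(planned)
```

      The plan printed above contains four Sort nodes; if the marked one were eliminated, three would remain.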
      
      

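      The Sort node marked with the arrow in the plan could have been avoided: the inner SortMergeJoin already emits rows ordered on its join keys id#996L and id#1000L, and t2id#1059L is only an alias of id#1000L.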
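
      The idea suggested by the issue title can be illustrated with a toy model. The sketch below is not Spark's implementation: `Attr`, `SortOrder`, `throughProject` and the `normalizeSameOrder` flag are hypothetical stand-ins that mimic how an ordering's equivalent expressions might be rewritten through a Project's aliases.

```scala
// Toy model only: these types mimic the idea and are not Spark's real classes.
object SameOrderSketch {
  // An attribute reference identified by its name, e.g. "id#1000".
  final case class Attr(name: String)

  // An ordering on `child`; `sameOrder` lists attributes known to be ordered
  // identically (for example, the other side's key of an equi-join).
  final case class SortOrder(child: Attr, sameOrder: Seq[Attr] = Nil) {
    // True when an ordering on `required` is already guaranteed.
    def satisfies(required: Attr): Boolean =
      child == required || sameOrder.contains(required)
  }

  // Rewrites an ordering through a Project's aliases (original -> alias).
  // The hypothetical flag controls whether `sameOrder` is rewritten as well.
  def throughProject(
      order: SortOrder,
      aliases: Map[Attr, Attr],
      normalizeSameOrder: Boolean): SortOrder = {
    def rename(a: Attr): Attr = aliases.getOrElse(a, a)
    val same = if (normalizeSameOrder) order.sameOrder.map(rename) else order.sameOrder
    SortOrder(rename(order.child), same)
  }

  def main(args: Array[String]): Unit = {
    // Ordering produced by the inner join: on id#996, with id#1000 equivalent.
    val joinOrder = SortOrder(Attr("id#996"), Seq(Attr("id#1000")))
    // The Project renames id#1000 to t2id#1059.
    val aliases = Map(Attr("id#1000") -> Attr("t2id#1059"))
    val required = Attr("t2id#1059")

    // Without normalizing sameOrder the alias is not recognized, so a Sort is needed.
    println(throughProject(joinOrder, aliases, normalizeSameOrder = false).satisfies(required))
    // With normalization the existing ordering already covers t2id#1059.
    println(throughProject(joinOrder, aliases, normalizeSameOrder = true).satisfies(required))
  }
}
```

      In this model the first check fails and the second succeeds, which mirrors the extra Sort in the plan versus the plan one would expect once the equivalent expressions are normalized.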

       

            People

              Assignee: prakharjain09 Prakhar Jain
              Reporter: prakharjain09 Prakhar Jain