[SPARK-23973] Remove consecutive sorts - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 2.4.0
Fix Version/s: 2.4.0
Component/s: SQL
Labels: None

Description

As a follow-on from SPARK-23375, it would be easy to remove redundant sorts in the following kind of query:

Seq((1), (3)).toDF("int").orderBy('int.asc).orderBy('int.desc).explain()

== Physical Plan ==
*(2) Sort [int#35 DESC NULLS LAST], true, 0
+- Exchange rangepartitioning(int#35 DESC NULLS LAST, 200)
   +- *(1) Sort [int#35 ASC NULLS FIRST], true, 0
      +- Exchange rangepartitioning(int#35 ASC NULLS FIRST, 200)
         +- LocalTableScan [int#35]

There's no need to perform the inner sort, *(1) Sort, because the outer *(2) Sort reorders every row anyway. Since the sort operator isn't stable, AFAIK, it should be OK to remove any sort whose ordering gets 'overwritten' by a subsequent sort in this way.
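To make the proposed rewrite concrete, here is a minimal sketch on a toy plan model. The types Plan, Scan and Sort and the object RemoveConsecutiveSorts are hypothetical names made up for illustration; they are not Spark's Catalyst classes, and this is not the change made in the linked pull request. It also assumes, as above, that an inner sort's ordering never needs to survive an outer sort.

```scala
// Toy logical plan. These types are illustrative only, not Spark's Catalyst API.
sealed trait Plan
case class Scan(columns: Seq[String]) extends Plan
// Each order entry is (column name, ascending?).
case class Sort(order: Seq[(String, Boolean)], child: Plan) extends Plan

object RemoveConsecutiveSorts {
  // Keep the outermost Sort of a run of Sorts and drop the ones beneath it.
  def apply(plan: Plan): Plan = plan match {
    case Sort(order, child) => Sort(order, apply(stripSorts(child)))
    case other              => other
  }

  // Skip over any Sort nodes stacked directly on top of each other.
  private def stripSorts(plan: Plan): Plan = plan match {
    case Sort(_, child) => stripSorts(child)
    case other          => other
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Models orderBy('int.asc).orderBy('int.desc) from the query above.
    val plan = Sort(Seq("int" -> false), Sort(Seq("int" -> true), Scan(Seq("int"))))
    // Only the outer (descending) sort remains above the scan.
    println(RemoveConsecutiveSorts(plan))
  }
}
```

A real rule would also have to decide which operators between the two sorts are safe to look through, which this sketch does not attempt.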

Issue Links

causes

SPARK-33183 Bug in optimizer rule EliminateSorts

Resolved

links to

[Github] Pull Request #21072 (mgaido91)

People

Assignee: Marco Gaido

Reporter: Henry Robinson

Votes: 0

Watchers: 3

Dates

Created: 12/Apr/18 23:46

Updated: 16/Nov/20 20:22

Resolved: 24/Apr/18 02:12