Spark / SPARK-4366 Aggregation Improvement / SPARK-10167

We need to explicitly use transformDown when rewriting aggregation results


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: SQL
    • Labels: None

    Description

      Right now, we explicitly use transformDown at https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L105 and https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L130. We should also be explicit about using transformDown at https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L300 and https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L334 (right now, transform means transformDown). The reason we need transformDown is that when we rewrite the final aggregate results, we should always match aggregate functions before grouping expressions. If we used transformUp, we could match a grouping expression first whenever a grouping expression appears as the child of an aggregate function, because the child would be rewritten before the enclosing aggregate function is visited.
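
      To illustrate why the traversal order matters, here is a minimal, self-contained Scala sketch. The Expr, AggFunc, and AttrRef classes and the substitution maps below are hypothetical stand-ins for the real Catalyst expressions and rewrite maps, not Spark's actual API; the transformDown/transformUp methods only mirror the pre-order/post-order semantics of Catalyst's transform methods.

      // Hypothetical, simplified stand-ins for Catalyst expression nodes.
      sealed trait Expr {
        def children: Seq[Expr]
        def withChildren(newChildren: Seq[Expr]): Expr

        // Pre-order rewrite: the rule sees a node before its children are rewritten.
        def transformDown(rule: PartialFunction[Expr, Expr]): Expr = {
          val afterRule = if (rule.isDefinedAt(this)) rule(this) else this
          afterRule.withChildren(afterRule.children.map(_.transformDown(rule)))
        }

        // Post-order rewrite: children are rewritten before the node itself is seen.
        def transformUp(rule: PartialFunction[Expr, Expr]): Expr = {
          val afterChildren = withChildren(children.map(_.transformUp(rule)))
          if (rule.isDefinedAt(afterChildren)) rule(afterChildren) else afterChildren
        }
      }

      // An attribute reference, e.g. the grouping column `key`.
      case class AttrRef(name: String) extends Expr {
        def children: Seq[Expr] = Nil
        def withChildren(newChildren: Seq[Expr]): Expr = this
      }

      // An aggregate function applied to a child expression, e.g. sum(key).
      case class AggFunc(child: Expr) extends Expr {
        def children: Seq[Expr] = Seq(child)
        def withChildren(newChildren: Seq[Expr]): Expr = copy(child = newChildren.head)
      }

      object RewriteOrderDemo extends App {
        // A final aggregate result that uses a grouping expression as the
        // child of an aggregate function.
        val groupingExpr = AttrRef("key")
        val resultExpr   = AggFunc(groupingExpr)

        // Substitution maps keyed by the *original* expressions, mirroring how
        // the planner replaces aggregate functions and grouping expressions
        // with the attributes output by the aggregate operator.
        val aggMap   = Map[Expr, Expr](AggFunc(groupingExpr) -> AttrRef("sumResult"))
        val groupMap = Map[Expr, Expr](groupingExpr -> AttrRef("keyOutput"))

        val rewrite: PartialFunction[Expr, Expr] = {
          case e if aggMap.contains(e)   => aggMap(e)
          case e if groupMap.contains(e) => groupMap(e)
        }

        // Top-down matches the aggregate function before its grouping-expression
        // child, so the whole function is replaced by its result attribute.
        println(resultExpr.transformDown(rewrite)) // AttrRef(sumResult)

        // Bottom-up rewrites the grouping expression first; the parent then
        // becomes AggFunc(AttrRef(keyOutput)), which no longer matches its map
        // entry and is left unrewritten.
        println(resultExpr.transformUp(rewrite))   // AggFunc(AttrRef(keyOutput))
      }

      In the bottom-up case the grouping expression is rewritten before the enclosing aggregate function, so the aggregate function no longer matches its entry in the substitution map and is left unrewritten; the top-down traversal avoids this by matching the aggregate function first.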

      There is nothing wrong with the current master. We just want to make sure we do not introduce bugs if the default behavior of transform is ever changed from transformDown to transformUp, which I think is very unlikely (but just in case).

          People

            Assignee: Josh Rosen (joshrosen)
            Reporter: Yin Huai (yhuai)
