[SPARK-35346] More clause needed for combining groupby and cube - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.0.0, 3.0.2, 3.1.1
Fix Version/s: None
Component/s: PySpark, SQL
Labels: None

Description

In PySpark, an aggregation must directly follow a groupBy, rollup, or cube call, and I think this part needs more features. In SQL we can write "group by xxx, xxx, cube(xxx, xxx)". In PySpark, if you need cube for only one field and a plain group by for the others, there is no way to express it. Using cube on all fields instead computes many more grouping combinations than needed, which adds a lot of cost for data that is never used. So I think we need to improve this. Thank you!
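To make the cost argument concrete, here is a minimal sketch in plain Python (no Spark required) that enumerates the grouping sets each form expands to under standard SQL semantics. The column names `a`, `b`, `c` and the helper names `cube_sets` and `partial_cube_sets` are hypothetical and chosen only for illustration; they are not part of any Spark API.

```python
from itertools import combinations


def cube_sets(cols):
    """Return every subset of cols: the grouping sets that CUBE(cols) expands to."""
    return [
        tuple(subset)
        for size in range(len(cols), -1, -1)
        for subset in combinations(cols, size)
    ]


def partial_cube_sets(group_cols, cube_cols):
    """Grouping sets for GROUP BY group_cols, CUBE(cube_cols).

    The plain group-by columns appear in every grouping set;
    only the cubed columns are toggled on and off.
    """
    return [tuple(group_cols) + subset for subset in cube_sets(cube_cols)]


# Hypothetical columns: cube everything vs. cube only "c".
full = cube_sets(["a", "b", "c"])
partial = partial_cube_sets(["a", "b"], ["c"])

print(len(full), "grouping sets for CUBE(a, b, c)")
print(len(partial), "grouping sets for GROUP BY a, b, CUBE(c):", partial)
```

With three columns, cubing everything yields 2^3 = 8 grouping sets, while grouping by two columns and cubing only the third yields 2. The gap grows exponentially with the number of columns that only need a plain group by.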

People

Assignee: Unassigned
Reporter: Kai
Votes: 0
Watchers: 2

Dates

Created: 08/May/21 03:39
Updated: 12/Dec/22 18:10