Spark / SPARK-35346

More clause needed for combining groupby and cube

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 3.0.2, 3.1.1
    • Fix Version/s: None
    • Component/s: PySpark, SQL
    • Labels: None

    Description

      As we all know, in PySpark an aggregation must directly follow a groupBy, rollup, or cube call. I think this part needs more features. In SQL we can write "group by xxx, xxx, cube(xxx,xxx)", but in PySpark there is no way to cube just one field while plainly grouping by the others. Using cube on all fields costs much more and produces data that is never used. So I think we need to improve it. Thank you!
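
      To make the cost argument concrete, here is a minimal pure-Python sketch (not Spark code) that enumerates the grouping sets each form would produce. The helper cube_grouping_sets and the column names a, b, c are hypothetical, chosen only for illustration. It assumes the usual SQL meaning of CUBE as "every subset of the listed columns", with the plain group-by columns present in every set.

```python
from itertools import combinations


def cube_grouping_sets(cube_cols, fixed_cols=()):
    """Enumerate grouping sets for GROUP BY fixed_cols..., CUBE(cube_cols...).

    Hypothetical helper for illustration; not part of PySpark.
    """
    fixed = tuple(fixed_cols)
    sets = []
    # CUBE expands to every subset of its columns, from all of them to none.
    for size in range(len(cube_cols), -1, -1):
        for subset in combinations(cube_cols, size):
            # Plain GROUP BY columns appear in every grouping set.
            sets.append(fixed + subset)
    return sets


# "group by a, b, cube(c)": only c is cubed.
partial = cube_grouping_sets(["c"], fixed_cols=["a", "b"])

# "cube(a, b, c)": the only option when every field must go through cube.
full = cube_grouping_sets(["a", "b", "c"])

print(len(partial), partial)
print(len(full), full)
```

      With three columns the partial form yields 2 grouping sets and the full cube yields 8; every extra cubed column doubles the count, which is the overhead described above.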

          People

            Assignee: Unassigned
            Reporter: wangjinjie722 Kai
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: