Spark / SPARK-33307 Refactor GROUPING ANALYTICS / SPARK-33229

Need to support partial CUBE/ROLLUP/GROUPING SETS and mixed case


Details

    Type: Sub-task
    Status: Resolved
    Priority: Major
    Resolution: Fixed
    Affects Version: 3.1.0
    Fix Version: 3.2.0
    Component: SQL
    Labels: None

    Description

      How to reproduce this issue:

      create table test_cube using parquet as select id as a, id as b, id as c from range(10);
      select a, b, c, count(*) from test_cube group by 1, cube(2, 3);
      
      spark-sql> select a, b, c, count(*) from test_cube group by 1, cube(2, 3);
      20/10/23 06:31:51 ERROR SparkSQLDriver: Failed in [select a, b, c, count(*) from test_cube group by 1, cube(2, 3)]
      java.lang.UnsupportedOperationException
      	at org.apache.spark.sql.catalyst.expressions.GroupingSet.dataType(grouping.scala:35)
      	at org.apache.spark.sql.catalyst.expressions.GroupingSet.dataType$(grouping.scala:35)
      	at org.apache.spark.sql.catalyst.expressions.Cube.dataType(grouping.scala:60)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkValidGroupingExprs$1(CheckAnalysis.scala:268)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$12(CheckAnalysis.scala:284)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$12$adapted(CheckAnalysis.scala:284)
      	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
      	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:284)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:92)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:177)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:92)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:89)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:130)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:156)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:153)
      	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:68)
      	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
      	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:133)
      	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
      	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:133)
      	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:68)
      	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:66)
      	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:58)
      	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
      
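      The trace shows CheckAnalysis.checkValidGroupingExprs calling dataType on the raw Cube placeholder expression, whose dataType is unsupported (grouping.scala:35), so the partial form is rejected before it can be expanded. For reference, a sketch of the GROUPING SETS expansion that group by 1, cube(2, 3) should be equivalent to (semantics per the SQL standard; table and column names taken from the repro above, not verified against any Spark release):

      -- Assumed expansion of: group by a, cube(b, c)
      -- a appears in every grouping set; cube(b, c) contributes (b, c), (b), (c), ()
      select a, b, c, count(*)
      from test_cube
      group by grouping sets ((a, b, c), (a, b), (a, c), (a));

      The "mixed case" in the summary refers to combining several such clauses in one GROUP BY, e.g. group by cube(a), rollup(b, c), which would expand into the cross product of the individual grouping sets.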

    People

      Assignee: angerszhu
      Reporter: Yuming Wang
      Votes: 0
      Watchers: 5
