Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-4296

Throw "Expression not in GROUP BY" when using same expression in group by clause and select clause

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Blocker
    • Resolution: Fixed
    • 1.1.0, 1.1.1, 1.2.0
    • 1.2.1, 1.3.0
    • SQL
    • None

    Description

      When the input data has a complex structure, using same expression in group by clause and select clause will throw "Expression not in GROUP BY".

      val sqlContext = new org.apache.spark.sql.SQLContext(sc)
      import sqlContext.createSchemaRDD
      case class Birthday(date: String)
      case class Person(name: String, birthday: Birthday)
      val people = sc.parallelize(List(Person("John", Birthday("1990-01-22")), Person("Jim", Birthday("1980-02-28"))))
      people.registerTempTable("people")
      val year = sqlContext.sql("select count(*), upper(birthday.date) from people group by upper(birthday.date)")
      year.collect
      

      Here is the plan of year:

      SchemaRDD[3] at RDD at SchemaRDD.scala:105
      == Query Plan ==
      == Physical Plan ==
      org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Expression not in GROUP BY: Upper(birthday#1.date AS date#9) AS c1#3, tree:
      Aggregate [Upper(birthday#1.date)], [COUNT(1) AS c0#2L,Upper(birthday#1.date AS date#9) AS c1#3]
       Subquery people
        LogicalRDD [name#0,birthday#1], MapPartitionsRDD[1] at mapPartitions at ExistingRDD.scala:36
      

      The bug is the equality test for `Upper(birthday#1.date)` and `Upper(birthday#1.date AS date#9)`.

      Maybe Spark SQL needs a mechanism to compare Alias expression and non-Alias expression.

      Attachments

        Issue Links

          Activity

            People

              lian cheng Cheng Lian
              zsxwing Shixiong Zhu
              Votes:
              1 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: