Description
Spark 1.4:
import org.apache.spark.sql.functions._
val df = Seq(("x", (1, 1)), ("y", (2, 2))).toDF("a", "b")
df.groupBy("b._1").agg(sum("b._2")).collect()

df: org.apache.spark.sql.DataFrame = [a: string, b: struct<_1:int,_2:int>]
res0: Array[org.apache.spark.sql.Row] = Array([1,1], [2,2])
Spark 1.5:
org.apache.spark.sql.AnalysisException: expression 'b' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() if you don't care which value you get.;
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:110)
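Possible workaround (a sketch only, not a fix, and not confirmed in this report): project the nested fields into top-level columns before grouping, so the aggregate no longer refers to the struct column b. The snippet assumes a spark-shell session where sqlContext is already defined; the column names key and value are made up for illustration.

```scala
import org.apache.spark.sql.functions._
import sqlContext.implicits._ // assumes the spark-shell's predefined sqlContext

val df = Seq(("x", (1, 1)), ("y", (2, 2))).toDF("a", "b")

// Flatten the struct fields into ordinary columns first.
// "key" and "value" are hypothetical names chosen for this example.
val flat = df.select(col("b._1").as("key"), col("b._2").as("value"))

// Group and aggregate on the flattened columns instead of "b._1" / "b._2".
flat.groupBy("key").agg(sum("value")).collect()
```

This sidesteps the analyzer check rather than addressing the regression itself; the original groupBy("b._1") form should still be expected to work as it did in 1.4.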