Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 2.4.0, 3.0.0
- Fix Version/s: None
- Component/s: None
Description
```
test("xxxxxxxx") {
  Seq(
    (1, 3), (2, 3), (3, 6), (4, 7), (5, 9), (6, 9)
  ).toDF("a", "b").createOrReplaceTempView("testData")
  val x = sql(
    """
      | SELECT b, sum(a) as a
      | FROM testData
      | GROUP BY b
      | HAVING sum(a) > 3
    """.stripMargin)
  x.explain()
  x.show()
}

[info] - xxxxxxxx *** FAILED *** (508 milliseconds)
[info]   org.apache.spark.sql.AnalysisException: Resolved attribute(s) a#184 missing from a#180,b#181 in operator !Aggregate [b#181], [b#181, sum(cast(a#180 as double)) AS a#184, sum(a#184) AS sum(a#184)#188]. Attribute(s) with the same name appear in the operation: a. Please check if the right attribute(s) are used.;;
[info] Project [b#181, a#184]
[info] +- Filter (sum(a#184)#188 > cast(3 as double))
[info]    +- !Aggregate [b#181], [b#181, sum(cast(a#180 as double)) AS a#184, sum(a#184) AS sum(a#184)#188]
[info]       +- SubqueryAlias `testdata`
[info]          +- Project [_1#177 AS a#180, _2#178 AS b#181]
[info]             +- LocalRelation [_1#177, _2#178]
```
```
test("xxxxxxxx") {
  Seq(
    ("1", "3"), ("2", "3"), ("3", "6"), ("4", "7"), ("5", "9"), ("6", "9")
  ).toDF("a", "b").createOrReplaceTempView("testData")
  val x = sql(
    """
      | SELECT b, sum(a) as a
      | FROM testData
      | GROUP BY b
      | HAVING sum(a) > 3
    """.stripMargin)
  x.explain()
  x.show()
}

== Physical Plan ==
*(2) Project [b#181, a#184L]
+- *(2) Filter (isnotnull(sum(cast(a#180 as bigint))#197L) && (sum(cast(a#180 as bigint))#197L > 3))
   +- *(2) HashAggregate(keys=[b#181], functions=[sum(cast(a#180 as bigint))])
      +- Exchange hashpartitioning(b#181, 5)
         +- *(1) HashAggregate(keys=[b#181], functions=[partial_sum(cast(a#180 as bigint))])
            +- *(1) Project [_1#177 AS a#180, _2#178 AS b#181]
               +- LocalTableScan [_1#177, _2#178]
```
I spent a lot of time on this but could not find which analyzer rule causes the difference: when the column is summed as double (first test), the query fails with the AnalysisException above, while the string-column version (summed as bigint, second test) plans and runs fine.
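For comparison, in standard SQL the `a` inside `HAVING sum(a)` resolves to the base table column, not the `sum(a) AS a` alias from the SELECT list, so the query is valid and filters on the per-group sum. A minimal sketch of that expected behavior using SQLite from Python's standard library (an illustration of the SQL semantics only, not of Spark's analyzer):

```python
import sqlite3

# Same data as the Spark repro: column "a" is summed per group of "b".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testData (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO testData VALUES (?, ?)",
    [(1, 3), (2, 3), (3, 6), (4, 7), (5, 9), (6, 9)],
)

# The alias "a" shadows the column name in the output, but the `a`
# inside HAVING sum(a) still refers to the base column.
rows = conn.execute(
    """
    SELECT b, sum(a) AS a
    FROM testData
    GROUP BY b
    HAVING sum(a) > 3
    ORDER BY b
    """
).fetchall()
print(rows)  # only groups whose sum of the base column exceeds 3
conn.close()
```

Groups b=3 and b=6 both sum to 3 and are filtered out; b=7 (sum 4) and b=9 (sum 11) remain, which matches the result the second (string-typed) Spark test produces.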
Attachments
Issue Links
- duplicates
  - SPARK-31519 Cast in having aggregate expressions returns the wrong result (Resolved)