[SPARK-18148] Misleading Error Message for Aggregation Without Window/GroupBy - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0
Fix Version/s: 2.0.2, 2.1.0
Component/s: SQL
Labels: None
Environment: Databricks

Description

The following error message points to a random column I'm not actually using in my query, making it hard to diagnose.

org.apache.spark.sql.AnalysisException: expression '`randomColumn`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;

Note that in the code below, I forgot to add .over(weeklyWindow) on the withColumn("user_count", ...) line.

spark.read.load("/some-data")
  .withColumn("date_dt", to_date($"date"))
  .withColumn("year", year($"date_dt"))
  .withColumn("week", weekofyear($"date_dt"))
  // Bug: count() is missing .over(weeklyWindow), so it is analyzed as a global aggregate.
  .withColumn("user_count", count($"userId"))
  .withColumn("daily_max_in_week", max($"user_count").over(weeklyWindow))
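
For comparison, here is a minimal sketch of what the intended query might look like once .over(weeklyWindow) is added to the user_count line. The report never shows how weeklyWindow is defined, so the definition below (partitioning by year and week) is an assumption made only for illustration, as is the SparkSession setup. The sketch is meant to be pasted into spark-shell.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{count, max, to_date, weekofyear, year}

// In spark-shell this returns the session that already exists.
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Assumption: the report does not define weeklyWindow; partitioning by
// year and week is a guess based on the column names in the query.
val weeklyWindow = Window.partitionBy($"year", $"week")

val result = spark.read.load("/some-data")
  .withColumn("date_dt", to_date($"date"))
  .withColumn("year", year($"date_dt"))
  .withColumn("week", weekofyear($"date_dt"))
  // count() is now a window function rather than a global aggregate
  .withColumn("user_count", count($"userId").over(weeklyWindow))
  .withColumn("daily_max_in_week", max($"user_count").over(weeklyWindow))
```

With the window attached, count is no longer treated as a plain aggregate, so the analyzer has no reason to demand that the other columns appear in a GROUP BY.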

CC: marmbrus

Issue Links

links to

[Github] Pull Request #15672 (jiangxb1987)

People

Assignee: Xingbo Jiang

Reporter: Pat McDonough

Votes: 0

Watchers: 5

Dates

Created: 28/Oct/16 01:43

Updated: 01/Nov/16 20:25

Resolved: 01/Nov/16 18:25