[SPARK-27653] Add max_by() / min_by() SQL aggregate functions - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0
Fix Version/s: 3.0.0
Component/s: SQL
Labels: None

Description

It would be useful if Spark SQL supported the max_by() SQL aggregate function. Quoting from the Presto docs:

max_by(x, y) → [same as x]
Returns the value of x associated with the maximum value of y over all input values.

min_by works similarly.
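
To make the quoted semantics concrete, here is a minimal pure-Python sketch of what max_by(x, y) and min_by(x, y) compute over a list of (x, y) pairs. It is an illustration only, not Spark's implementation. Two behaviors are assumptions made for this sketch and are not stated in the description above: rows whose y is None are skipped, and ties go to the first row seen.

```python
from typing import Iterable, Optional, Tuple, TypeVar

X = TypeVar("X")


def max_by(rows: Iterable[Tuple[X, Optional[int]]]) -> Optional[X]:
    """Return the x paired with the largest y, or None if nothing qualifies."""
    # Assumption for this sketch: rows with a missing ordering value are ignored.
    candidates = [(x, y) for x, y in rows if y is not None]
    if not candidates:
        return None
    # max() keeps the first maximal element, so ties go to the earliest row
    # (an assumption of this sketch, not a documented guarantee).
    return max(candidates, key=lambda pair: pair[1])[0]


def min_by(rows: Iterable[Tuple[X, Optional[int]]]) -> Optional[X]:
    """Return the x paired with the smallest y, or None if nothing qualifies."""
    candidates = [(x, y) for x, y in rows if y is not None]
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[1])[0]


# Hypothetical sample data: (product, revenue) pairs.
sample = [("apple", 10), ("pear", 50), ("plum", 20)]
best = max_by(sample)   # x associated with the maximum y
worst = min_by(sample)  # x associated with the minimum y
```

With the sample data, best is the product with the highest revenue and worst is the one with the lowest.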

Technically, I can emulate this behavior using window functions, but the resulting syntax is much more verbose and less intuitive than max_by / min_by.
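
As a rough illustration of the verbosity mentioned above, the following sketch runs one common window-function workaround (ROW_NUMBER over a partition, then a filter on rank 1). It uses Python's built-in sqlite3 module as a stand-in engine so it can run without a Spark cluster. The sales table, its columns, and its rows are hypothetical, and a SQLite build with window-function support (3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "apple", 10),
        ("east", "pear", 30),
        ("west", "plum", 25),
        ("west", "fig", 5),
    ],
)

# Window-function emulation: rank rows within each region by revenue,
# then keep only the top-ranked row per region.
WINDOW_EMULATION = """
SELECT region, product
FROM (
    SELECT region,
           product,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY revenue DESC) AS rn
    FROM sales
) AS ranked
WHERE rn = 1
ORDER BY region
"""

# The form requested in this issue would be a single aggregate
# (shown as a string only; SQLite does not provide max_by):
REQUESTED_FORM = """
SELECT region, max_by(product, revenue)
FROM sales
GROUP BY region
"""

top_products = conn.execute(WINDOW_EMULATION).fetchall()
conn.close()
```

The emulation needs a subquery, a window specification, and a filter, whereas the requested form expresses the same intent with one aggregate call and a GROUP BY.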

Issue Links

relates to: SPARK-36963 Add max_by/min_by to sql.functions (Resolved)

links to: GitHub Pull Request #24557, GitHub Pull Request #26264

People

Assignee: L. C. Hsieh

Reporter: Josh Rosen

Votes: 0

Watchers: 5

Dates

Created: 07/May/19 22:54

Updated: 09/Oct/21 03:02

Resolved: 13/May/19 14:39