Details
- New Feature
- Status: Resolved
- Major
- Resolution: Fixed
- 3.1.0
- None
Description
Unlike many databases, Spark SQL allows the use of FIRST and LAST in non-analytic contexts.
At the moment, the descriptions of FIRST
> first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
and LAST
> last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
suggest that their behavior is deterministic, and many users assume that they return specific values, for example when running a query such as
SELECT first(foo) FROM ( SELECT * FROM table ORDER BY bar )
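That, however, doesn't seem to be the case: the ordering imposed by the subquery is not guaranteed to be preserved when the outer aggregate is evaluated.
To make the situation worse, the query often appears to work (for example on small samples in local mode), so the problem can go unnoticed.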
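The effect can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not Spark's actual implementation: it assumes a table sorted by bar and split into three partitions, whose partial results may reach the final aggregation step in any order. All names here (table, partitions, first_partial, merge_first, min_by) are hypothetical and chosen for illustration.

```python
import itertools
import random

def first_partial(rows):
    """Partial aggregate for one partition: the first 'foo' seen, or None if empty."""
    return rows[0]["foo"] if rows else None

def merge_first(partials):
    """Merge partial results by keeping the first non-empty one in arrival order."""
    for value in partials:
        if value is not None:
            return value
    return None

def min_by(rows, value, key):
    """Order-independent alternative: the value from the row with the smallest key."""
    return min(rows, key=lambda r: r[key])[value]

# Hypothetical table, sorted by 'bar' and split into three partitions.
table = [{"foo": f"f{i}", "bar": i} for i in range(9)]
partitions = [table[0:3], table[3:6], table[6:9]]

# Let the partial results arrive in every possible order and collect the answers.
first_results = set()
for order in itertools.permutations(partitions):
    first_results.add(merge_first([first_partial(p) for p in order]))

# The same rows in a random order, aggregated by key instead of by position.
shuffled = table[:]
random.shuffle(shuffled)
stable_result = min_by(shuffled, "foo", "bar")

print(sorted(first_results))  # more than one possible answer for first(foo)
print(stable_result)          # always the row with the smallest 'bar'
```

In this model, first(foo) returns whichever partition's leading row arrives first, while an aggregate keyed on the ordering column does not depend on arrival order. In Spark SQL, min_by(foo, bar) (available since Spark 3.0) or an analytic first_value(foo) OVER (ORDER BY bar) expresses that intent without relying on the row order of a subquery.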