[SPARK-34882] RewriteDistinctAggregates can cause a bug if the aggregator does not ignore NULLs - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.4.8, 3.0.3, 3.1.2, 3.2.0
Fix Version/s: 3.2.0
Component/s: SQL
Labels:
- correctness

Description

group-by.sql

SELECT
    first(DISTINCT a), last(DISTINCT a),
    first(a), last(a),
    first(DISTINCT b), last(DISTINCT b),
    first(b), last(b)
FROM testData WHERE a IS NOT NULL AND b IS NOT NULL;

group-by.sql.out

-- !query schema
struct<first(DISTINCT a):int,last(DISTINCT a):int,first(a):int,last(a):int,first(DISTINCT b):int,last(DISTINCT b):int,first(b):int,last(b):int>
-- !query output
NULL	1	1	3	1	NULL	1	2

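No column of the result should be NULL, because the WHERE clause filters out every row where a or b is NULL. Yet first(DISTINCT a) and last(DISTINCT b) both return NULL.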
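The issue title points at aggregators that do not ignore NULLs. The following is a simplified plain-Python model of how such a NULL could appear. It is not Spark's actual implementation. It assumes that the rewrite expands each input row into one copy per aggregate group, tagged with a group id, and that an aggregate sees NULL in place of values belonging to other groups. The names expand, first, first_distinct_a_masked, first_distinct_a_filtered and the sample rows are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# (gid, a, b): gid tags which aggregate group an expanded row belongs to.
ExpandedRow = Tuple[int, Optional[int], Optional[int]]


def expand(rows: List[Tuple[int, int]]) -> List[ExpandedRow]:
    """Assumed model: emit one copy of each row per aggregate group."""
    out: List[ExpandedRow] = []
    for a, b in rows:
        out.append((0, a, b))     # gid 0: non-distinct aggregates
        out.append((1, a, None))  # gid 1: DISTINCT a
        out.append((2, None, b))  # gid 2: DISTINCT b
    return out


def first(values: Iterable[Optional[int]]) -> Optional[int]:
    """first() that does not ignore NULLs: returns the first value, even None."""
    for v in values:
        return v
    return None


def first_distinct_a_masked(expanded: List[ExpandedRow]) -> Optional[int]:
    # Rows from other groups are masked to None but still reach the aggregate.
    return first(a if gid == 1 else None for gid, a, _b in expanded)


def first_distinct_a_filtered(expanded: List[ExpandedRow]) -> Optional[int]:
    # Rows from other groups never reach the aggregate.
    return first(a for gid, a, _b in expanded if gid == 1)


# Hypothetical sample rows with no NULLs, standing in for the filtered testData.
rows = [(1, 1), (1, 2), (2, 1)]
expanded = expand(rows)

print(first_distinct_a_masked(expanded))    # None
print(first_distinct_a_filtered(expanded))  # 1
```

In this model, masking only works for aggregates that skip NULLs, such as count or sum. An aggregate like first or last returns whatever value it sees, so a masked row from another group can surface as a NULL result even though the input contains no NULLs. Restricting the rows that reach the aggregate avoids the problem.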

Issue Links

links to

[Github] Pull Request #31983 (tanelk)

People

Assignee: Tanel Kiis

Reporter: Tanel Kiis

Votes: 0

Watchers: 3

Dates

Created: 28/Mar/21 19:09

Updated: 31/Mar/21 22:44

Resolved: 31/Mar/21 22:44