[SPARK-47070] Subquery rewrite inside an aggregation makes an aggregation invalid - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.5.0
Fix Version/s: 4.0.0
Component/s: Spark Core
Labels:
- pull-request-available

Description

When an `IN` or `EXISTS` subquery appears inside an aggregate expression of a top-level GROUP BY, the subquery is rewritten and a new `exists` variable is introduced. However, the aggregation does not handle this variable correctly. For example, consider the following query:

```
SELECT
  CASE
    WHEN t1.id IN (SELECT id FROM t2) THEN 10
    ELSE -10
  END AS v1
FROM t1
GROUP BY t1.id;
```

Executing this query fails with the following error:
```
java.lang.IllegalArgumentException: Cannot find column index for attribute 'exists#844' in: Map()
```
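
For comparison, the result the query is expected to produce can be sketched with another SQL engine. The snippet below is only an illustration, not part of the report: it uses Python's built-in `sqlite3` module as a stand-in for Spark, the table contents are made-up sample data, and `t1.id` is added to the select list purely to make the output easier to read.

```python
import sqlite3


def expected_rows():
    """Run the query from the report against SQLite (a stand-in, not Spark)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; the report does not specify table contents.
    conn.executescript(
        """
        CREATE TABLE t1 (id INTEGER);
        CREATE TABLE t2 (id INTEGER);
        INSERT INTO t1 VALUES (1), (2), (3);
        INSERT INTO t2 VALUES (2), (3), (4);
        """
    )
    rows = conn.execute(
        """
        SELECT
          t1.id,  -- added for readability; not in the original query
          CASE
            WHEN t1.id IN (SELECT id FROM t2) THEN 10
            ELSE -10
          END AS v1
        FROM t1
        GROUP BY t1.id
        ORDER BY t1.id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in expected_rows():
        print(row)
```

With this sample data, ids present in `t2` map to 10 and the remaining ids map to -10, which is the behavior one would presumably expect from Spark once the `exists` variable is handled correctly in the aggregation.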

Issue Links

links to:
- GitHub Pull Request #45133
- GitHub Pull Request #45412

People

Assignee: Anton Kirillov
Reporter: Anton Lykov
Votes: 0
Watchers: 1

Dates

Created: 16/Feb/24 04:48
Updated: 07/Mar/24 01:56
Resolved: 04/Mar/24 14:54