Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
4.0.0
Description
This query is not legal and should give an error, but instead we incorrectly allow it and it returns wrong results.
create table x(x1 int, x2 int); insert into x values (1, 1); create table y(y1 int, y2 int); insert into y values (2, 2), (3, 3); select *, (select count(*) from y where y1 > x1 group by y1) from x;
It returns two rows, even though there's only one row of x.
The correct result is an error: more than one row returned by a subquery used as an expression (as seen in postgres for example)
This is a longstanding bug. The bug is in CheckAnalysis in checkAggregateInScalarSubquery. It allows grouping columns that are present in correlation predicates, but doesn’t check whether those predicates are equalities - because when that code was written, non-equality correlation wasn’t allowed. Therefore, it looks like this bug has existed since non-equality correlation was added (~2 years ago).
Various other expressions that are not equi-joins between the inner and outer fields hit this too, e.g. `where y1 + y2 = x1 group by y1`.
Another bugged case is if the correlation condition is an equality but it's under another operator like an OUTER JOIN or UNION.