Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
-
CDH 5.10.0
-
ghx-label-7
Description
When combining a full outer join with a left join, some of the left join predicates seem to be treated as general WHERE-clauses, which leads to missing rows. Minimal working example:
create table A (a int, av int); create table B (a int, bv int); create table C (a int, cv int); insert into A values (1,1), (2,2), (3,3); insert into B values (2,22),(4,44); insert into C values (2,222); -- all results are returned as expected select * from A full outer join B on a.a=b.a left join C on (coalesce(a.a, b.a)=c.a); -- only one row is returned, as if the last clause was a WHERE clause select * from A full outer join B on a.a=b.a left join C on (coalesce(a.a, b.a)=c.a and coalesce(a.av,b.bv)=2); -- no rows are returned at all, even though only the columns of C should be affected select * from A full outer join B on a.a=b.a left join C on (coalesce(a.a, b.a)=c.a and coalesce(a.av,b.bv)=100); -- removing the full outer join leads to the expected result select * from A left join C on (coalesce(a.a)=c.a and coalesce(a.av)=100);
Running the exact same SQL in PostgreSQL, only the columns of C are ever affected by the left join ON condition, the number of rows never changes. As far as we understand, this should be the expected behaviour.