Details
Type: Bug
Priority: Major
Status: Resolved
Resolution: Fixed
Affects Version/s: 3.3.2, 3.4.0, 3.5.0
Description
Assume this data:
create or replace temp view t1 as values (1), (2), (3) as (c1);
create or replace temp view t2 as values (2), (3), (4) as (c1);
The following query produces incorrect results:
spark-sql (default)> select explode(array(t1.c1, t2.c1)) as x1
                   > from t1
                   > full outer join t2
                   > using (c1);
1
-1   <== should be NULL
2
2
3
3
-1   <== should be NULL
4
Time taken: 0.663 seconds, Fetched 8 row(s)
spark-sql (default)>
Similar issues occur with right outer joins and left outer joins.
t1.c1 and t2.c1 have the wrong nullability at the time the array expression is resolved, so the array's containsNull value is incorrect: the full outer join can produce nulls in both columns, but the array's type does not admit them.
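To make the expected output concrete, here is a plain-Python sketch (not Spark code) of the semantics the query should follow: a full outer join with a USING clause pads the non-matching side with NULL, so explode(array(t1.c1, t2.c1)) must emit those NULLs rather than -1.

```python
# Plain-Python simulation of: t1 full outer join t2 using (c1),
# then explode(array(t1.c1, t2.c1)). None stands in for SQL NULL.
t1 = [1, 2, 3]
t2 = [2, 3, 4]

# FULL OUTER JOIN USING (c1): one output row per key present on either side;
# the side that lacks the key contributes NULL for its c1 column.
keys = sorted(set(t1) | set(t2))
joined = [(k if k in t1 else None, k if k in t2 else None) for k in keys]

# explode(array(t1.c1, t2.c1)): flatten each row's two values into rows.
exploded = [v for row in joined for v in row]
print(exploded)  # [1, None, 2, 2, 3, 3, None, 4]
```

The two None values correspond to the rows where spark-sql prints -1 instead of NULL.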
Queries that don't use arrays can also produce wrong results. Assume this data:
create or replace temp view t1 as values (0), (1), (2) as (c1);
create or replace temp view t2 as values (1), (2), (3) as (c1);
create or replace temp view t3 as values (1, 2), (3, 4), (4, 5) as (a, b);
The following query produces incorrect results:
spark-sql (default)> select t1.c1 as t1_c1, t2.c1 as t2_c1, b
                   > from t1
                   > full outer join t2
                   > using (c1),
                   > lateral (
                   >   select b
                   >   from t3
                   >   where a = coalesce(t2.c1, 1)
                   > ) lt3;
1       1       2
NULL    3       4
Time taken: 2.395 seconds, Fetched 2 row(s)
spark-sql (default)>
The result should be the following:
0       NULL    2
1       1       2
NULL    3       4
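As a sanity check on that expected result, here is a plain-Python sketch (not Spark code) of the full outer join plus the lateral subquery. It reproduces all three rows, including the first row that the buggy query drops.

```python
# Plain-Python simulation of:
#   select t1.c1, t2.c1, b
#   from t1 full outer join t2 using (c1),
#   lateral (select b from t3 where a = coalesce(t2.c1, 1)) lt3
# None stands in for SQL NULL.
t1 = [0, 1, 2]
t2 = [1, 2, 3]
t3 = {1: 2, 3: 4, 4: 5}  # maps column a -> column b

keys = sorted(set(t1) | set(t2))
rows = []
for k in keys:
    t1_c1 = k if k in t1 else None
    t2_c1 = k if k in t2 else None
    # Lateral subquery: b from t3 where a = coalesce(t2.c1, 1).
    a = t2_c1 if t2_c1 is not None else 1
    if a in t3:  # no t3 match means the lateral join emits no row
        rows.append((t1_c1, t2_c1, t3[a]))
print(rows)  # [(0, None, 2), (1, 1, 2), (None, 3, 4)]
```

The row (0, None, 2) exists only because coalesce(t2.c1, 1) falls back to 1 when t2.c1 is NULL; treating t2.c1 as non-nullable lets that coalesce be simplified away, which is how the row goes missing.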