Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
1.5.0, 1.6.0, 2.0.0, 2.1.0, 2.2.0, 2.3.0
Description
The following query returns an incorrect answer:
set spark.sql.autoBroadcastJoinThreshold=-1;
create table a as select * from values 1;
create table b as select * from values 2;
SELECT t3.col1, t1.col1 FROM a t1 CROSS JOIN b t2 CROSS JOIN b t3
This should return the row (2, 1), but it returns (null, 1) instead. If you permute the order of the columns in the SELECT list, or the order of the joins, the query returns a valid answer (i.e. one without incorrect NULLs).
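For reference, the expected result can be checked outside Spark. The sketch below is plain Python and does not use Spark. It models the two single-row tables from the query above as lists of tuples and computes the cross join with itertools.product. The helper name cross_join_projection is hypothetical and is introduced only for illustration.

```python
from itertools import product

# Table contents taken from the report: a holds the single value 1, b holds 2.
a = [(1,)]
b = [(2,)]


def cross_join_projection(t1, t2, t3):
    """Return (t3.col1, t1.col1) for every row of t1 CROSS JOIN t2 CROSS JOIN t3.

    Hypothetical helper: models the SELECT list of the failing query,
    not Spark's actual join implementation.
    """
    return [(r3[0], r1[0]) for r1, r2, r3 in product(t1, t2, t3)]


rows = cross_join_projection(a, b, b)
print(rows)  # [(2, 1)]
```

Each table has exactly one row, so the cross join produces exactly one row, and neither projected column can be NULL.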
This turns out to be due to two longstanding bugs in GenerateUnsafeRowJoiner, which I'll describe in more detail in my PR.