Description
Example #1 (sort merge join):
create or replace temp view v1 as
select * from values (1, 1), (2, 2), (3, 1) as v1(key, value);

create or replace temp view v2 as
select * from values (1, 22, 22), (3, -1, -1), (7, null, null) as v2(a, b, c);

select * from v1 full outer join v2 on key = a and value > b and value > c;
The join's generated code causes the following compilation error:
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 277, Column 9: Redefinition of local variable "smj_isNull_7"
Example #2 (shuffle hash join):
select /*+ SHUFFLE_HASH(v2) */ * from v1 full outer join v2 on key = a and value > b and value > c;
The shuffle hash join's generated code causes the following compilation error:
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 174, Column 5: Redefinition of local variable "shj_value_1"
With default configuration, both queries end up succeeding, since Spark falls back to running each query with whole-stage codegen disabled.
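To see the compilation error directly rather than the silent fallback, one option is to turn the fallback off before running either query. This is a minimal sketch, assuming the internal setting spark.sql.codegen.fallback (default true) controls the fallback in the Spark version under test:

```sql
-- Assumption: spark.sql.codegen.fallback is an internal Spark SQL setting
-- that defaults to true and controls the fallback to non-codegen execution.
set spark.sql.codegen.fallback=false;

-- With the fallback disabled, the query is expected to fail with the
-- CompileException instead of being re-run without whole-stage codegen.
select * from v1 full outer join v2 on key = a and value > b and value > c;
```

Resetting the setting to true afterwards restores the default behavior described above.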
The issue occurs only when the join's bound condition refers to the same stream-side column more than once, as both examples do with value (value > b and value > c).
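For contrast, a variant of the first query that references value only once in the bound condition would not be expected to trigger the error, going by the condition stated above. This is an illustrative sketch that reuses the v1 and v2 views from Example #1, not a verified test case:

```sql
-- value appears once in the bound condition; key is used only as the
-- equi-join key, so no stream-side column is referenced more than once.
select * from v1 full outer join v2 on key = a and value > b;
```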