Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
Impala 2.3.0, Impala 2.5.0
Description
Impala may return incorrect results for queries that have an outer join with an On-clause predicate that references at least two tables, but not the right-hand side of the join.
Example query to repro and its plan:
select a.id aid, b.id bid, a.int_col aint, b.int_col bint from functional.alltypes a inner join functional.alltypes b on a.int_col = b.int_col left outer join functional.alltypes c on a.id = b.id and b.bigint_col = c.bigint_col +-----------------------------------------------------------+ | Explain String | +-----------------------------------------------------------+ | Estimated Per-Host Requirements: Memory=320.08MB VCores=3 | | | | 08:EXCHANGE [UNPARTITIONED] | | | | | 04:HASH JOIN [LEFT OUTER JOIN, BROADCAST] | | | hash predicates: b.bigint_col = c.bigint_col <--- b.id = a.id should be here and not in join below| | | | | |--07:EXCHANGE [BROADCAST] | | | | | | | 02:SCAN HDFS [functional.alltypes c] | | | partitions=24/24 files=24 size=478.45KB | | | | | 03:HASH JOIN [INNER JOIN, PARTITIONED] | | | hash predicates: b.int_col = a.int_col, b.id = a.id | | | runtime filters: RF000 <- a.int_col, RF001 <- a.id | | | | | |--06:EXCHANGE [HASH(a.int_col,a.id)] | | | | | | | 00:SCAN HDFS [functional.alltypes a] | | | partitions=24/24 files=24 size=478.45KB | | | | | 05:EXCHANGE [HASH(b.int_col,b.id)] | | | | | 01:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | runtime filters: RF000 -> b.int_col, RF001 -> b.id | +-----------------------------------------------------------+