Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- Impala 2.0
- None
Description
Query: explain select a.* from functional.alltypes a cross join functional.alltypes b where not exists (select * from functional.alltypes c where a.id = c.id and a.int_col < b.int_col)
+-----------------------------------------------------------+
| Explain String                                            |
+-----------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=320.06MB VCores=2 |
|                                                           |
| 07:EXCHANGE [UNPARTITIONED]                               |
| |                                                         |
| 04:HASH JOIN [LEFT ANTI JOIN, BROADCAST]                  |
| |  hash predicates: a.id = c.id                           |
| |                                                         |
| |--06:EXCHANGE [BROADCAST]                                |
| |  |                                                      |
| |  02:SCAN HDFS [functional.alltypes c]                   |
| |     partitions=24/24 size=478.45KB                      |
| |                                                         |
| 03:CROSS JOIN [BROADCAST]                                 |
| |  predicates: a.int_col < b.int_col                      |
| |                                                         |
| |--05:EXCHANGE [BROADCAST]                                |
| |  |                                                      |
| |  01:SCAN HDFS [functional.alltypes b]                   |
| |     partitions=24/24 size=478.45KB                      |
| |                                                         |
| 00:SCAN HDFS [functional.alltypes a]                      |
|    partitions=24/24 size=478.45KB                         |
+-----------------------------------------------------------+
It is incorrect to assign the predicate "a.int_col < b.int_col" to the cross join because of NULLs. We also need to preserve those rows from the cross join where "(a.int_col < b.int_col) IS NULL", but the simpler fix is to just assign the predicate at the anti join.
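The difference can be reproduced on a small data set. The following is a minimal sketch, not Impala itself: it uses Python's built-in sqlite3 module as a stand-in engine, and the three-row table `t` is a hypothetical replacement for functional.alltypes. The second query only emulates the plan shown above by moving the comparison out of the subquery; it is an assumption about how that plan behaves, not output from Impala.

```python
import sqlite3

# Hypothetical three-row stand-in for functional.alltypes; one int_col is NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, int_col INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (3, None)])

# The query as written: the comparison is evaluated inside NOT EXISTS.
correct = conn.execute("""
    SELECT a.id, b.id FROM t a CROSS JOIN t b
    WHERE NOT EXISTS (SELECT * FROM t c
                      WHERE a.id = c.id AND a.int_col < b.int_col)
    ORDER BY a.id, b.id
""").fetchall()

# Emulation of the plan above: the comparison filters the cross join,
# and the anti join checks only a.id = c.id.
as_planned = conn.execute("""
    SELECT a.id, b.id FROM t a CROSS JOIN t b
    WHERE a.int_col < b.int_col
      AND NOT EXISTS (SELECT * FROM t c WHERE a.id = c.id)
    ORDER BY a.id, b.id
""").fetchall()

# Pairs from the cross join for which the comparison evaluates to NULL.
null_pairs = conn.execute("""
    SELECT a.id, b.id FROM t a CROSS JOIN t b
    WHERE (a.int_col < b.int_col) IS NULL
    ORDER BY a.id, b.id
""").fetchall()

print("query as written:", correct)
print("emulated plan:   ", as_planned)
print("NULL comparison: ", null_pairs)
```

On this data the emulated plan returns no rows, while the query as written returns every pair except (1, 2), including all pairs for which the comparison is NULL.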
Issue Links
- relates to IMPALA-1371 Predicate applied incorrectly when FULL OUTER JOIN is present (Resolved)