Details
-
Bug
-
Status: Open
-
Critical
-
Resolution: Unresolved
-
Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0
-
None
Description
Impala may generate an incorrect plan for queries that have a correlated scalar subquery as a parameter to a NULL-checking conditional function like ISNULL().
Example query and incorrect plan:
select t1.int_col
from functional.alltypessmall as t1
where t1.int_col >= isnull
(
(
SELECT
MAX(t2.bigint_col)
FROM
functional.alltypestiny AS t2
WHERE
t1.id = t2.id + 10000
),
0
)
Fetched 0 row(s) in 1.09s
Single-node plan:
+-----------------------------------------------------------------------+
| Explain String |
+-----------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=0 |
| |
| PLAN-ROOT SINK |
| | |
| 03:HASH JOIN [LEFT SEMI JOIN] |
| | hash predicates: t1.id = t2.id + 10000 |
| | other join predicates: t1.int_col >= isnull(max(t2.bigint_col), 0) |
| | runtime filters: RF000 <- t2.id + 10000 |
| | |
| |--02:AGGREGATE [FINALIZE] |
| | | output: max(t2.bigint_col) |
| | | group by: t2.id |
| | | |
| | 01:SCAN HDFS [functional.alltypestiny t2] |
| | partitions=4/4 files=4 size=460B |
| | |
| 00:SCAN HDFS [functional.alltypessmall t1] |
| partitions=4/4 files=4 size=6.32KB |
| runtime filters: RF000 -> t1.id |
+-----------------------------------------------------------------------+
The query returns an empty result set but instead should return all rows from t1 because all invocations of the subquery return NULL, and all rows from t1 satisfy "t1.int_col >= 0".
Attachments
Issue Links
- is related to
-
IMPALA-7782 discrepancy in results with a subquery containing an agg that produces an empty set
-
- Resolved
-
-
IMPALA-5226 Add support for arbitrary subqueries in WHERE/HAVING, including inside disjuncts
-
- Open
-
- relates to
-
IMPALA-10382 Predicate with coalesce on both sides of LOJ isn't NULL filtering
-
- Resolved
-