Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
None
-
ghx-label-12
Description
IMPALA-8954 added support for uncorrelated subqueries but some do not return correct results. Both of those queries should return rows with NULLs where the subquery returned 0 rows.
[localhost.EXAMPLE.COM:21000] default> select (select min(int_col) from functional.alltypes having min(int_col) < 0) from functional.alltypestiny; Fetched 0 row(s) in 0.16s [localhost.EXAMPLE.COM:21000] default> select (select min(int_col) from functional.alltypes limit 0) from functional.alltypestiny; Fetched 0 row(s) in 0.14s
The problem is that the CROSS JOIN will return 0 rows if the subquery returns 0 rows.
[localhost.EXAMPLE.COM:21000] default> explain select (select min(int_col) from functional.alltypes having min(int_col) < 0) from functional.alltypestiny; Query: explain select (select min(int_col) from functional.alltypes having min(int_col) < 0) from functional.alltypestiny +-------------------------------------------------------------+ | Explain String | +-------------------------------------------------------------+ | Max Per-Host Resource Reservation: Memory=40.00KB Threads=5 | | Per-Host Resource Estimates: Memory=180MB | | Codegen disabled by planner | | | | PLAN-ROOT SINK | | | | | 03:NESTED LOOP JOIN [CROSS JOIN, BROADCAST] | | | row-size=4B cardinality=8 | | | | | |--06:EXCHANGE [UNPARTITIONED] | | | | | | | 00:SCAN HDFS [functional.alltypestiny] | | | HDFS partitions=4/4 files=4 size=460B | | | row-size=0B cardinality=8 | | | | | 05:AGGREGATE [FINALIZE] | | | output: min:merge(int_col) | | | having: min(int_col) < 0 | | | row-size=4B cardinality=1 | | | | | 04:EXCHANGE [UNPARTITIONED] | | | | | 02:AGGREGATE | | | output: min(int_col) | | | row-size=4B cardinality=1 | | | | | 01:SCAN HDFS [functional.alltypes] | | HDFS partitions=24/24 files=24 size=478.45KB | | row-size=4B cardinality=7.30K | +-------------------------------------------------------------+ Fetched 29 row(s) in 0.04s
We need to detect cases where the subquery can return 0 rows and instead insert a left outer join.
I did this in a patch and it fixed the issue.
Attachments
Issue Links
- is broken by
-
IMPALA-8954 Support uncorrelated subqueries in the select list
- Resolved
- is related to
-
IMPALA-7782 discrepancy in results with a subquery containing an agg that produces an empty set
- Resolved