Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
-
ghx-label-5
Description
Queries may return wrong results if an EXISTS subquery has an ORDER BY with a LIMIT and OFFSET clause. The EXISTS subquery may incorrectly evaluate to TRUE even though it s FALSE.
Reproduction:
select count(*) from functional.alltypestiny t where exists (select id from functional.alltypestiny where id < 5 order by id limit 10 offset 6);
The query should return "0" but it incorrectly returns "8" because an incorrect plan without the offset is generated. See plan:
+-------------------------------------------------+
| Explain String |
+-------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=0B |
| Per-Host Resource Estimates: Memory=84.00MB |
| Codegen disabled by planner |
| |
| PLAN-ROOT SINK |
| | |
| 08:AGGREGATE [FINALIZE] |
| | output: count:merge(*) |
| | |
| 07:EXCHANGE [UNPARTITIONED] |
| | |
| 04:AGGREGATE |
| | output: count(*) |
| | |
| 03:NESTED LOOP JOIN [LEFT SEMI JOIN, BROADCAST] |
| | |
| |--06:EXCHANGE [BROADCAST] |
| | | |
| | 05:MERGING-EXCHANGE [UNPARTITIONED] |
| | | order by: id ASC |
| | | limit: 1 |
| | | |
| | 02:TOP-N [LIMIT=1] |
| | | order by: id ASC |
| | | |
| | 01:SCAN HDFS [functional.alltypestiny] |
| | partitions=4/4 files=4 size=460B |
| | predicates: id < 5 |
| | |
| 00:SCAN HDFS [functional.alltypestiny t] |
| partitions=4/4 files=4 size=460B |
+-------------------------------------------------+
Evaluating the subquery by itself gives the expected results:
select id from functional.alltypestiny where id < 5 order by id limit 10 offset 6; <empty result set>