Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.12.0, Impala 3.1.0
-
ghx-label-7
Description
A discrepancy exists between Impala and Postgres when a subquery contains an agg and results in an empty set, yet the WHERE clause looking at the subquery should produce a "True" condition.
Example queries include:
USE functional; SELECT id FROM alltypestiny WHERE -1 NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false); SELECT id FROM alltypestiny WHERE NULL NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false); SELECT id FROM alltypestiny WHERE (SELECT COUNT(id) FROM alltypestiny HAVING false) IS NULL;
These queries do not produce any rows in Impala. In Postgres, the queries produce all 8 rows for the functional.alltypestiny id column.
Thinking maybe there were Impala and Postgres differences with NOT IN behavior, I also tried this:
USE functional; SELECT id FROM alltypestiny WHERE -1 NOT IN (SELECT 1 FROM alltypestiny WHERE bool_col IS NULL);
This subquery also produces an empty set just like the subquery in the problematic queries at the top, but unlike those queries, this full query returns the same results in Impala and Postgres (all 8 rows for the functional.alltypestiny id column).
For anyone interested in this bug, you can migrate data into postgres in a dev environment using
tests/comparison/data_generator.py --use-postgresql --migrate-table-names alltypestiny --db-name functional migrate
This is in 2.12 at least, so it's not a 3.1 regression.
Attachments
Issue Links
- relates to
-
IMPALA-4373 Wrong results with correlated WHERE-clause subquery inside a NULL-checking conditional function.
- Open
-
IMPALA-9949 Subqueries in select can result in rows not being returned
- Resolved