[SPARK-15370] Some correlated subqueries return incorrect answers - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Resolved
Affects Version/s: 2.0.0
Fix Version/s: None
Component/s: SQL
Labels: None
Target Version/s: 2.0.0

Description

The correlated-subquery rewrite introduced in SPARK-14785 exhibits the COUNT bug: it changes the semantics of some correlated subqueries when tuples from the outer query block do not join with the subquery. For example:

spark-sql> create table R(a integer) as values (1);
spark-sql> create table S(b integer);
spark-sql> select R.a from R 
         >     where (select count(*) from S where R.a = S.b) = 0;
Time taken: 2.139 seconds                                                       
spark-sql> 
(returns zero rows; the answer should be one row of '1')

This problem also affects the SELECT clause:

spark-sql> select R.a, 
         >     (select count(*) from S where R.a = S.b) as cnt 
         > from R;
1	NULL
(the answer should be "1 0")
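
Both symptoms above are consistent with a decorrelation that joins R against a pre-aggregated S and loses the COUNT value for outer rows that have no match. The following is a minimal sketch of that effect, not Spark's actual plan: it uses Python's built-in sqlite3 module as a stand-in SQL engine, and the left-outer-join form (including the alias agg and the helper run) is an assumed approximation of the rewrite. The COALESCE variant is shown only as one conventional way to restore the expected answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(a INTEGER);
    CREATE TABLE S(b INTEGER);
    INSERT INTO R VALUES (1);
""")

def run(sql):
    """Hypothetical helper: execute a query and return all rows."""
    return conn.execute(sql).fetchall()

# Reference semantics: the subquery is evaluated once per row of R,
# and COUNT(*) over zero matching rows is 0.
where_correlated = run(
    "SELECT R.a FROM R WHERE (SELECT COUNT(*) FROM S WHERE R.a = S.b) = 0")
select_correlated = run(
    "SELECT R.a, (SELECT COUNT(*) FROM S WHERE R.a = S.b) AS cnt FROM R")

# Assumed decorrelated form: aggregate S once, then outer-join it to R.
# Rows of R without a match see cnt = NULL instead of 0.
joined = ("FROM R LEFT OUTER JOIN "
          "(SELECT b, COUNT(*) AS cnt FROM S GROUP BY b) agg ON R.a = agg.b")
where_rewritten = run(f"SELECT R.a {joined} WHERE agg.cnt = 0")
select_rewritten = run(f"SELECT R.a, agg.cnt {joined}")

# Substituting 0 for the missing count restores the reference answers.
where_patched = run(f"SELECT R.a {joined} WHERE COALESCE(agg.cnt, 0) = 0")
select_patched = run(f"SELECT R.a, COALESCE(agg.cnt, 0) {joined}")

print(where_correlated, select_correlated)  # [(1,)] [(1, 0)]
print(where_rewritten, select_rewritten)    # [] [(1, None)]
print(where_patched, select_patched)        # [(1,)] [(1, 0)]
```

The middle line of output matches the two wrong answers reported above: zero rows for the WHERE form and "1 NULL" for the SELECT form.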

Some subqueries with COUNT aggregates are not affected:

spark-sql> select R.a from R 
         >     where (select count(*) from S where R.a = S.b) > 0;
Time taken: 0.609 seconds
spark-sql>
(Correct answer)

spark-sql> select R.a from R 
         >     where (select count(*) + sum(S.b) from S where R.a = S.b) = 0;
Time taken: 0.553 seconds
spark-sql> 
(Correct answer)
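
One way to read why these two queries are unaffected (an interpretation, not something stated in the report): for an outer row with no match, the predicate rejects the row whether the subquery yields its true empty-input value or NULL. The sketch below checks that with sqlite3 as a stand-in engine; the helper keeps_row and the cases list are hypothetical names introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S(b INTEGER)")  # empty, as in the report

def keeps_row(sql):
    """A WHERE clause keeps a row only if the predicate is true.
    Both false (0) and NULL (None) reject the row."""
    return bool(conn.execute(sql).fetchone()[0])

# (aggregate expression, comparison applied to it in the WHERE clause)
cases = [
    ("COUNT(*)", "= 0"),
    ("COUNT(*)", "> 0"),
    ("COUNT(*) + SUM(b)", "= 0"),
]

affected = {}
for agg, cmp in cases:
    # Expected behaviour: the aggregate is evaluated over zero rows.
    expected = keeps_row(f"SELECT ({agg}) {cmp} FROM S")
    # Buggy behaviour: the aggregate value is lost and NULL is seen instead.
    with_null = keeps_row(f"SELECT (NULL) {cmp}")
    affected[f"{agg} {cmp}"] = expected != with_null

for query, is_affected in affected.items():
    print(query, "->", "affected" if is_affected else "not affected")
```

Only the "= 0" test on a bare COUNT(*) is reported as affected, because that is the only case where the empty-input value (0) makes the predicate true.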

Other cases can trigger a variant of the COUNT bug for expressions involving NULL checks:

spark-sql> select R.a from R 
         > where (select sum(S.b) is null from S where R.a = S.b);
(returns zero rows, should return one row)
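
The NULL-check variant can be reproduced the same way. This is a sketch under the same assumptions as before: sqlite3 stands in for the SQL engine, and the left-outer-join form with the aliases agg and flag is an assumed approximation of the rewrite rather than Spark's actual plan. The patched query defaults the missing value to whatever the aggregate expression yields over zero rows, which is one possible remedy and not necessarily the one adopted in the linked pull requests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(a INTEGER);
    CREATE TABLE S(b INTEGER);
    INSERT INTO R VALUES (1);
""")

# Reference semantics: SUM over zero rows is NULL, so "IS NULL" is true.
correlated = conn.execute("""
    SELECT R.a FROM R
    WHERE (SELECT SUM(S.b) IS NULL FROM S WHERE R.a = S.b)
""").fetchall()

# Assumed decorrelated form: unmatched rows of R see flag = NULL,
# which the WHERE clause treats as "not true".
joined = """
    FROM R LEFT OUTER JOIN
         (SELECT b, SUM(b) IS NULL AS flag FROM S GROUP BY b) agg
      ON R.a = agg.b
"""
rewritten = conn.execute(f"SELECT R.a {joined} WHERE agg.flag").fetchall()

# Value of the aggregate expression when no rows match (1 means true).
default = conn.execute(
    "SELECT SUM(b) IS NULL FROM S WHERE 1 = 0").fetchone()[0]

# Use that value for outer rows that found no join partner.
patched = conn.execute(
    f"SELECT R.a {joined} WHERE COALESCE(agg.flag, ?)", (default,)
).fetchall()

print(correlated)  # [(1,)]
print(rewritten)   # []
print(patched)     # [(1,)]
```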

Issue Links

relates to

SPARK-18455 General support for correlated subquery processing (Resolved)

links to

[Github] Pull Request #13155 (frreiss)

[Github] Pull Request #13626 (hvanhovell)

[Github] Pull Request #13629 (hvanhovell)

People

Assignee: Unassigned

Reporter: Frederick Reiss

Votes: 0

Watchers: 7

Dates

Created: 17/May/16 21:09

Updated: 15/Nov/16 22:22

Resolved: 12/Jun/16 21:23