Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-15370

Some correlated subqueries return incorrect answers

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Resolved
    • 2.0.0
    • None
    • SQL
    • None

    Description

      The rewrite introduced in SPARK-14785 has the COUNT bug. The rewrite changes the semantics of some correlated subqueries when there are tuples from the outer query block that do not join with the subquery. For example:

      spark-sql> create table R(a integer) as values (1);
      spark-sql> create table S(b integer);
      spark-sql> select R.a from R 
               >     where (select count(*) from S where R.a = S.b) = 0;
      Time taken: 2.139 seconds                                                       
      spark-sql> 
      (returns zero rows; the answer should be one row of '1')
      

      This problem also affects the SELECT clause:

      spark-sql> select R.a, 
               >     (select count(*) from S where R.a = S.b) as cnt 
               > from R;
      1	NULL
      (the answer should be "1 0")
      

      Some subqueries with COUNT aggregates are not affected:

      spark-sql> select R.a from R 
               >     where (select count(*) from S where R.a = S.b) > 0;
      Time taken: 0.609 seconds
      spark-sql>
      (Correct answer)
      
      spark-sql> select R.a from R 
               >     where (select count(*) + sum(S.b) from S where R.a = S.b) = 0;
      Time taken: 0.553 seconds
      spark-sql> 
      (Correct answer)
      

      Other cases can trigger the variant of the COUNT bug for expressions involving NULL checks:

      spark-sql> select R.a from R 
               > where (select sum(S.b) is null from S where R.a = S.b);
      (returns zero rows, should return one row)
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              freiss Frederick Reiss
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: