Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-45507

Correctness bug in correlated scalar subqueries with COUNT aggregates

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

       
      create view if not exists t1(a1, a2) as values (0, 1), (1, 2);
      create view if not exists t2(b1, b2) as values (0, 2), (0, 3);
      create view if not exists t3(c1, c2) as values (0, 2), (0, 3);
      
      -- Example 1
      select (
        select SUM(l.cnt + r.cnt)
        from (select count(*) cnt from t2 where t1.a1 = t2.b1 having cnt = 0) l
        join (select count(*) cnt from t3 where t1.a1 = t3.c1 having cnt = 0) r
        on l.cnt = r.cnt
      ) from t1
      
      -- Correct answer: (null, 0)
      +----------------------+
      |scalarsubquery(c1, c1)|
      +----------------------+
      |null                  |
      |null                  |
      +----------------------+
      
      -- Example 2
      select ( select sum(cnt) from (select count(*) cnt from t2 where t1.c1 = t2.c1) ) from t1
      
      -- Correct answer: (2, 0)
      +------------------+
      |scalarsubquery(c1)|
      +------------------+
      |2                 |
      |null              |
      +------------------+
      
      -- Example 3
      select ( select count(*) from (select count(*) cnt from t2 where t1.c1 = t2.c1) ) from t1
      
      -- Correct answer: (1, 1)
      +------------------+
      |scalarsubquery(c1)|
      +------------------+
      |1                 |
      |0                 |
      +------------------+ 

       

       

      DB fiddle for correctness check:https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/10403#

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            andyylam Andy Lam
            andyylam Andy Lam
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment