Spark / SPARK-35553 Improve correlated subqueries / SPARK-36115

Handle the COUNT bug for correlated IN/EXISTS subquery

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

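      Correlated IN/EXISTS subqueries are also subject to the COUNT bug, which is currently not handled. An aggregate such as count(*) returns a value (0) even when no row in the subquery matches the outer row, so outer rows without a match must still be evaluated against that value rather than dropped.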

      create view t1(c1, c2) as values (0, 1), (1, 2)
      create view t2(c1, c2) as values (0, 2), (0, 3)
      
      -- Example 1: IN subquery
      select * from t1 where c1 in (select count(*) + 1 from t2 where t1.c1 = t2.c1)
      -- Correct answer: (1, 2). Actual result (empty):
      +---+---+
      |c1 |c2 |
      +---+---+
      +---+---+
      
      -- Example 2: EXISTS subquery
      select * from t1 where exists (select count(*) from t2 where t1.c1 = t2.c1)
      -- Correct answer: (0, 1), (1, 2). Actual result:
      +---+---+
      |c1 |c2 |
      +---+---+
      |0  |1  |
      +---+---+
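
      The expected answers can be cross-checked outside Spark. The sketch below is only an illustration: it assumes an in-memory SQLite database as a reference engine, uses tables instead of views, and adds an ORDER BY for stable output. The join-based queries are a hypothetical rewrite (grouping t2 by c1 and joining on it) chosen to show how the COUNT bug arises; they are not taken from Spark's optimizer.

```python
import sqlite3

# Assumption: SQLite serves as a reference engine; it evaluates the
# correlated subquery once per outer row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1(c1 int, c2 int);
    create table t2(c1 int, c2 int);
    insert into t1 values (0, 1), (1, 2);
    insert into t2 values (0, 2), (0, 3);
""")


def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Example 1 (IN) exactly as in the description.
in_expected = rows("""
    select * from t1
    where c1 in (select count(*) + 1 from t2 where t1.c1 = t2.c1)
    order by c1
""")

# Hypothetical join-based rewrite: grouping t2 by c1 produces no group for
# c1 = 1, so the outer row (1, 2) never sees count(*) = 0.
in_rewritten = rows("""
    select t1.* from t1
    join (select c1, count(*) + 1 as cnt from t2 group by c1) s
      on s.c1 = t1.c1 and s.cnt = t1.c1
    order by t1.c1
""")

# Example 2 (EXISTS) exactly as in the description.
exists_expected = rows("""
    select * from t1
    where exists (select count(*) from t2 where t1.c1 = t2.c1)
    order by c1
""")

# Same hypothetical rewrite for EXISTS: only outer rows with a group survive.
exists_rewritten = rows("""
    select t1.* from t1
    join (select c1, count(*) as cnt from t2 group by c1) s
      on s.c1 = t1.c1
    order by t1.c1
""")

print("IN expected:     ", in_expected)
print("IN rewritten:    ", in_rewritten)
print("EXISTS expected: ", exists_expected)
print("EXISTS rewritten:", exists_rewritten)
```

      Under these assumptions the per-row evaluation gives the answers marked as correct above, while the grouped-join rewrite reproduces the incorrect results shown in the two examples.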
      

          People

            Assignee: Unassigned
            Reporter: Allison Wang (allisonwang-db)