Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-18578

Full outer join in correlated subquery returns incorrect results

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.0.0
    • 2.1.0
    • SQL

    Description

      Full outer join with a correlated predicate in the left operand in a subquery may return incorrect results.

      Example:

      Seq(1).toDF("c1").createOrReplaceTempView("t1")
      Seq(2).toDF("c1").createOrReplaceTempView("t2")
      Seq(1).toDF("c1").createOrReplaceTempView("t3")
      
      // Test case: 01 EXISTS subquery context
      // Expected result: 1 row
      // Actual result:   0 row
      sql("select * from t1 where exists (select 1 from (select c1 from t2 where t1.c1 = 2) t2 full join t3 on t2.c1=t3.c1)").show
      
      // Test case: 02 Scalar subquery context
      // Expected result: 1 row of value 1
      // Actual result:   1 row of NULL
      // Note adding the meaningless equi-join T1.c1=T2.c1 to get through the CROSS JOIN restriction
      sql("select (select max(1) from (select c1 from t2 where t1.c1 = 2 and t1.c1=t2.c1) t2 full join t3 on t2.c1=t3.c1) from t1").show
      

      Attachments

        Issue Links

          Activity

            People

              nsyca Nattavut Sutyanyong
              nsyca Nattavut Sutyanyong
              Herman van Hövell Herman van Hövell
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: