SPARK-46468

COUNT bug in lateral/exists subqueries

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 3.0.0
    • Fix Version: 4.0.0
    • Component: SQL

    Description

      Some further instances of the COUNT bug.

      One example is this test from join-lateral.sql:

      https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757

       

      According to PostgreSQL, the query should return 2 rows:

       c1 | c2 | sum
      ----+----+------
        0 |  1 |    2
        1 |  2 | NULL

       

      whereas Spark SQL only returns the first one.
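The general shape of this kind of COUNT bug can be sketched outside Spark. The snippet below is only an illustration under stated assumptions: it uses SQLite through Python's standard `sqlite3` module as a stand-in engine, the query is a hypothetical one (not the query from join-lateral.sql, so the aggregate values differ from the table above), and the inner-join rewrite is the textbook faulty decorrelation, not a claim about the plan Spark actually produces. It shows that an aggregate over an empty correlated input must still yield a row for the outer tuple, which the naive rewrite drops.

```python
import sqlite3

# In-memory SQLite database with the same data as the t1/t2 views in this issue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1(c1 INTEGER, c2 INTEGER);
    CREATE TABLE t2(c1 INTEGER, c2 INTEGER);
    INSERT INTO t1 VALUES (0, 1), (1, 2);
    INSERT INTO t2 VALUES (0, 2), (0, 3);
""")

# Correlated form (hypothetical query): the aggregate runs once per t1 row.
# For t1.c1 = 1 no t2 row matches, so sum() sees an empty input and yields NULL,
# but the outer row is still part of the result.
correlated = conn.execute("""
    SELECT t1.c1, t1.c2,
           (SELECT sum(t2.c2) FROM t2 WHERE t2.c1 = t1.c1) AS s
    FROM t1
    ORDER BY t1.c1
""").fetchall()

# Naive decorrelation: pre-aggregate t2 per key, then INNER JOIN.
# Keys with no t2 rows produce no group, so the t1 row with c1 = 1 disappears.
naive = conn.execute("""
    SELECT t1.c1, t1.c2, agg.s
    FROM t1
    JOIN (SELECT c1, sum(c2) AS s FROM t2 GROUP BY c1) AS agg
      ON agg.c1 = t1.c1
    ORDER BY t1.c1
""").fetchall()

print("correlated:", correlated)
print("naive join:", naive)
```

In this sketch the correlated form keeps both outer rows, with a NULL aggregate for the row that has no match, while the inner-join rewrite returns only the first row. That mirrors the missing second row described above.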

       

      A similar instance is the following query, which should return 1 row from t1 but currently returns an empty result:

      CREATE TEMPORARY VIEW t1(c1, c2) AS VALUES (0, 1), (1, 2);
      CREATE TEMPORARY VIEW t2(c1, c2) AS VALUES (0, 2), (0, 3);

      SELECT tt1.c2
      FROM t1 AS tt1
      WHERE tt1.c1 IN (
        SELECT max(tt2.c1)
        FROM t2 AS tt2
        WHERE tt1.c2 IS NULL
      );

People

    • Assignee: Andrey Gubichev (gubichev)
    • Reporter: Andrey Gubichev (gubichev)