Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-2543

Correlated subquery where outer table contains NULL values returns seemingly wrong result

    XMLWordPrintableJSON

Details

    Description

      0: jdbc:drill:schema=dfs> select * from t1;
      +------------+------------+------------+
      |     a1     |     b1     |     c1     |
      +------------+------------+------------+
      | 1          | 2015-03-01 | aaaaa      |
      | 2          | 2015-03-02 | bbbbb      |
      | null       | null       | null       |
      +------------+------------+------------+
      3 rows selected (0.064 seconds)
      
      0: jdbc:drill:schema=dfs> select * from t2;
      +------------+------------+------------+
      |     a2     |     b2     |     c2     |
      +------------+------------+------------+
      | 5          | 2017-03-01 | a          |
      +------------+------------+------------+
      1 row selected (0.07 seconds)
      
      0: jdbc:drill:schema=dfs> select t1.c1, count(*) from t1 where t1.b1 not in (select b2 from t2 where t1.a1 = t2.a2) group by t1.c1 order by t1.c1;
      +------------+------------+
      |     c1     |   EXPR$1   |
      +------------+------------+
      | aaaaa      | 1          |
      | bbbbb      | 1          |
      +------------+------------+
      2 rows selected (0.32 seconds)
      

      Postgres returns row from the outer table where a1 is null.
      This is part that I don't understand, because join condition in the subquery should have eliminated row where a1 IS NULL. To me Drill result looks correct. Unless there is something different in correlated comparison semantics that I'm not aware of.

      postgres=# select * from t1;
       a1 |     b1     |  c1
      ----+------------+-------
        1 | 2015-03-01 | aaaaa
        2 | 2015-03-02 | bbbbb
          |            |
      (3 rows)
      

      Explain plan for the query:

      00-01      Project(c1=[$0], EXPR$1=[$1])
      00-02        StreamAgg(group=[{0}], EXPR$1=[COUNT()])
      00-03          Sort(sort0=[$0], dir0=[ASC])
      00-04            Project(c1=[$0])
      00-05              SelectionVectorRemover
      00-06                Filter(condition=[NOT(IS TRUE($3))])
      00-07                  HashJoin(condition=[=($1, $2)], joinType=[left])
      00-09                    Project($f1=[$0], $f3=[$2])
      00-11                      SelectionVectorRemover
      00-13                        Filter(condition=[IS NOT NULL($1)])
      00-15                          Project(c1=[$1], b1=[$0], a1=[$2])
      00-17                            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/test/t1]], selectionRoot=/test/t1, numFiles=1, columns=[`c1`, `b1`, `a1`]]])
      00-08                    Project($f02=[$1], $f2=[$2])
      00-10                      StreamAgg(group=[{0, 1}], agg#0=[MIN($2)])
      00-12                        Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
      00-14                          Project($f0=[$1], $f02=[$2], $f1=[true])
      00-16                            HashJoin(condition=[=($2, $0)], joinType=[inner])
      00-18                              StreamAgg(group=[{0}])
      00-20                                Sort(sort0=[$0], dir0=[ASC])
      00-22                                  Project($f0=[$1])
      00-23                                    SelectionVectorRemover
      00-24                                      Filter(condition=[IS NOT NULL($0)])
      00-25                                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/test/t1]], selectionRoot=/test/t1, numFiles=1, columns=[`b1`, `a1`]]])
      00-19                              Project(a2=[$1], b2=[$0])
      00-21                                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/test/t2]], selectionRoot=/test/t2, numFiles=1, columns=[`a2`, `b2`]]])
      

      Attachments

        Activity

          People

            jni Jinfeng Ni
            vicky Victoria Markman
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: