Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-3404

Filter on window function does not appear in query plan

VotersWatch issueWatchersLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      Filter is missing in the query plan for the below query in Drill, and hence wrong results are returned.

      Results from Drill

      0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
      +-------------+-------+-------------+
      |     c1      |  c2   |    w_sum    |
      +-------------+-------+-------------+
      | 0           | a     | 0           |
      | 1           | a     | 1           |
      | 5           | a     | 6           |
      | 10          | a     | 16          |
      | 11          | a     | 27          |
      | 14          | a     | 41          |
      | 11111       | a     | 11152       |
      | 2           | b     | 2           |
      | 9           | b     | 11          |
      | 13          | b     | 24          |
      | 17          | b     | 41          |
      | null        | c     | null        |
      | 4           | c     | 4           |
      | 6           | c     | 10          |
      | 8           | c     | 18          |
      | 12          | c     | 30          |
      | 13          | c     | 56          |
      | 13          | c     | 56          |
      | null        | d     | null        |
      | null        | d     | null        |
      | 10          | d     | 10          |
      | 11          | d     | 21          |
      | 2147483647  | d     | 4294967315  |
      | 2147483647  | d     | 4294967315  |
      | -1          | e     | -1          |
      | 15          | e     | 14          |
      | null        | null  | null        |
      | 19          | null  | 19          |
      | 65536       | null  | 65555       |
      | 1000000     | null  | 1065555     |
      +-------------+-------+-------------+
      30 rows selected (0.337 seconds)
      

      Explain plan for the above query from Drill

      0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
      +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
      |                                                                                                                                                                                                                                                                                                                                       text                                                                                                                                                                                                                                                                                                                                       | json  |
      +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
      | 00-00    Screen
      00-01      Project(c1=[$0], c2=[$1], w_sum=[$2])
      00-02        Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
      00-03          Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
      00-04            SelectionVectorRemover
      00-05              Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
      00-06                Project(c1=[$1], c2=[$0])
      00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
      

      Results from Postgres 9.3

      postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query where w_sum is not null;
           c1     | c2 |   w_sum    
      ------------+----+------------
                0 | a  |          0
                1 | a  |          1
                5 | a  |          6
               10 | a  |         16
               11 | a  |         27
               14 | a  |         41
            11111 | a  |      11152
                2 | b  |          2
                9 | b  |         11
               13 | b  |         24
               17 | b  |         41
                4 | c  |          4
                6 | c  |         10
                8 | c  |         18
               12 | c  |         30
               13 | c  |         56
               13 | c  |         56
               10 | d  |         10
               11 | d  |         21
       2147483647 | d  | 4294967315
       2147483647 | d  | 4294967315
               -1 | e  |         -1
               15 | e  |         14
               19 |    |         19
            65536 |    |      65555
          1000000 |    |    1065555
      (26 rows)
      

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            amansinha100 Aman Sinha
            khfaraaz Khurram Faraaz
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment