[DRILL-3404] Filter on window function does not appear in query plan - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.1.0
Fix Version/s: 1.1.0
Component/s: Query Planning & Optimization
Labels:
- window_function
Environment:

4 node cluster on CentOS

Description

Filter is missing in the query plan for the below query in Drill, and hence wrong results are returned.

Results from Drill

0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
+-------------+-------+-------------+
|     c1      |  c2   |    w_sum    |
+-------------+-------+-------------+
| 0           | a     | 0           |
| 1           | a     | 1           |
| 5           | a     | 6           |
| 10          | a     | 16          |
| 11          | a     | 27          |
| 14          | a     | 41          |
| 11111       | a     | 11152       |
| 2           | b     | 2           |
| 9           | b     | 11          |
| 13          | b     | 24          |
| 17          | b     | 41          |
| null        | c     | null        |
| 4           | c     | 4           |
| 6           | c     | 10          |
| 8           | c     | 18          |
| 12          | c     | 30          |
| 13          | c     | 56          |
| 13          | c     | 56          |
| null        | d     | null        |
| null        | d     | null        |
| 10          | d     | 10          |
| 11          | d     | 21          |
| 2147483647  | d     | 4294967315  |
| 2147483647  | d     | 4294967315  |
| -1          | e     | -1          |
| 15          | e     | 14          |
| null        | null  | null        |
| 19          | null  | 19          |
| 65536       | null  | 65555       |
| 1000000     | null  | 1065555     |
+-------------+-------+-------------+
30 rows selected (0.337 seconds)

Explain plan for the above query from Drill

0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
|                                                                                                                                                                                                                                                                                                                                       text                                                                                                                                                                                                                                                                                                                                       | json  |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| 00-00    Screen
00-01      Project(c1=[$0], c2=[$1], w_sum=[$2])
00-02        Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
00-03          Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
00-04            SelectionVectorRemover
00-05              Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
00-06                Project(c1=[$1], c2=[$0])
00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])

Results from Postgres 9.3

postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query where w_sum is not null;
     c1     | c2 |   w_sum    
------------+----+------------
          0 | a  |          0
          1 | a  |          1
          5 | a  |          6
         10 | a  |         16
         11 | a  |         27
         14 | a  |         41
      11111 | a  |      11152
          2 | b  |          2
          9 | b  |         11
         13 | b  |         24
         17 | b  |         41
          4 | c  |          4
          6 | c  |         10
          8 | c  |         18
         12 | c  |         30
         13 | c  |         56
         13 | c  |         56
         10 | d  |         10
         11 | d  |         21
 2147483647 | d  | 4294967315
 2147483647 | d  | 4294967315
         -1 | e  |         -1
         15 | e  |         14
         19 |    |         19
      65536 |    |      65555
    1000000 |    |    1065555
(26 rows)

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

0_0_0.parquet
26/Jun/15 21:53
0.4 kB
Khurram Faraaz
0001-Don-t-use-SumEmptyIsZero-SUM0-window-aggregate-until.patch
30/Jun/15 16:58
1 kB
Aman Sinha
0001-DRILL-3304-Use-SUM-instead-of-SUM0-for-window-aggreg.patch
30/Jun/15 20:17
3 kB
Aman Sinha

Filter on window function does not appear in query plan

Details

Description

Attachments

Attachments

Activity

People

Dates