Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
1.1.0
-
4 node cluster on CentOS
Description
Filter is missing in the query plan for the below query in Drill, and hence wrong results are returned.
Results from Drill
0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null; +-------------+-------+-------------+ | c1 | c2 | w_sum | +-------------+-------+-------------+ | 0 | a | 0 | | 1 | a | 1 | | 5 | a | 6 | | 10 | a | 16 | | 11 | a | 27 | | 14 | a | 41 | | 11111 | a | 11152 | | 2 | b | 2 | | 9 | b | 11 | | 13 | b | 24 | | 17 | b | 41 | | null | c | null | | 4 | c | 4 | | 6 | c | 10 | | 8 | c | 18 | | 12 | c | 30 | | 13 | c | 56 | | 13 | c | 56 | | null | d | null | | null | d | null | | 10 | d | 10 | | 11 | d | 21 | | 2147483647 | d | 4294967315 | | 2147483647 | d | 4294967315 | | -1 | e | -1 | | 15 | e | 14 | | null | null | null | | 19 | null | 19 | | 65536 | null | 65555 | | 1000000 | null | 1065555 | +-------------+-------+-------------+ 30 rows selected (0.337 seconds)
Explain plan for the above query from Drill
0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null; +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ | text | json | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ | 00-00 Screen 00-01 Project(c1=[$0], c2=[$1], w_sum=[$2]) 00-02 Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)]) 00-03 Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])]) 00-04 SelectionVectorRemover 00-05 Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first]) 00-06 Project(c1=[$1], c2=[$0]) 00-07 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
Results from Postgres 9.3
postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query where w_sum is not null;
c1 | c2 | w_sum
------------+----+------------
0 | a | 0
1 | a | 1
5 | a | 6
10 | a | 16
11 | a | 27
14 | a | 41
11111 | a | 11152
2 | b | 2
9 | b | 11
13 | b | 24
17 | b | 41
4 | c | 4
6 | c | 10
8 | c | 18
12 | c | 30
13 | c | 56
13 | c | 56
10 | d | 10
11 | d | 21
2147483647 | d | 4294967315
2147483647 | d | 4294967315
-1 | e | -1
15 | e | 14
19 | | 19
65536 | | 65555
1000000 | | 1065555
(26 rows)