Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-3567

Wrong result in a query with multiple window functions and different over clauses

    XMLWordPrintableJSON

Details

    Description

      0: jdbc:drill:drillbit=localhost> select * from t1;
      +-------+--------+-------------+
      |  a1   |   b1   |     c1      |
      +-------+--------+-------------+
      | 1     | aaaaa  | 2015-01-01  |
      | 2     | bbbbb  | 2015-01-02  |
      | 3     | ccccc  | 2015-01-03  |
      | 4     | null   | 2015-01-04  |
      | 5     | eeeee  | 2015-01-05  |
      | 6     | fffff  | 2015-01-06  |
      | 7     | ggggg  | 2015-01-07  |
      | null  | hhhhh  | 2015-01-08  |
      | 9     | iiiii  | null        |
      | 10    | jjjjj  | 2015-01-10  |
      +-------+--------+-------------+
      10 rows selected (0.078 seconds)
      

      Wrong result, columns are projected in the wrong order:

      0: jdbc:drill:drillbit=localhost> select
      . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as count1,
      . . . . . . . . . . . . . . . . >         count(*) over(partition by a1 order by c1) as count2,
      . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as sum1
      . . . . . . . . . . . . . . . . > from 
      . . . . . . . . . . . . . . . . >         t1;
      +---------+---------+-------+
      | count1  | count2  | sum1  |
      +---------+---------+-------+
      | 1       | 1       | 1     |
      | 1       | 2       | 1     |
      | 1       | 3       | 1     |
      | 1       | 4       | 1     |
      | 1       | 5       | 1     |
      | 1       | 6       | 1     |
      | 1       | 7       | 1     |
      | 1       | 9       | 1     |
      | 1       | 10      | 1     |
      | 1       | null    | 1     |
      +---------+---------+-------+
      10 rows selected (0.113 seconds)
      

      Explain plan:

      0: jdbc:drill:drillbit=localhost> explain plan for select
      . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as count1,
      . . . . . . . . . . . . . . . . >         count(*) over(partition by a1 order by c1) as count2,
      . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as sum1
      . . . . . . . . . . . . . . . . > from 
      . . . . . . . . . . . . . . . . >         t1;
      +------+------+
      | text | json |
      +------+------+
      | 00-00    Screen
      00-01      ProjectAllowDup(count1=[$0], count2=[$1], sum1=[$2])
      00-02        Project(w0$o0=[$4], w0$o1=[$5], w1$o0=[$6])
      00-03          Window(window#0=[window(partition {3} order by [2] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
      00-04            SelectionVectorRemover
      00-05              Sort(sort0=[$3], sort1=[$2], dir0=[ASC], dir1=[ASC])
      00-06                Window(window#0=[window(partition {1} order by [2] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT(), SUM($3)])])
      00-07                  SelectionVectorRemover
      00-08                    Sort(sort0=[$1], sort1=[$2], dir0=[ASC], dir1=[ASC])
      00-09                      Project(T61¦¦*=[$0], b1=[$1], c1=[$2], a1=[$3])
      00-10                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/vmarkman/drill/testdata/subqueries/t1]], selectionRoot=file:/Users/vmarkman/drill/testdata/subqueries/t1, numFiles=1, columns=[`*`]]])
      

      If you remove frame that is not the same as other two, query works correctly:

      0: jdbc:drill:drillbit=localhost> select
      . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as count1,
      . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as sum1
      . . . . . . . . . . . . . . . . > from
      . . . . . . . . . . . . . . . . >         t1;
      +---------+-------+
      | count1  | sum1  |
      +---------+-------+
      | 1       | 1     |
      | 1       | 2     |
      | 1       | 3     |
      | 1       | 5     |
      | 1       | 6     |
      | 1       | 7     |
      | 1       | null  |
      | 1       | 9     |
      | 1       | 10    |
      | 1       | 4     |
      +---------+-------+
      10 rows selected (0.099 seconds)
      

      and in the different order (just for fun) :

      0: jdbc:drill:drillbit=localhost> select
      . . . . . . . . . . . . . . . . >         sum(a1)  over(partition by b1 order by c1) as sum1,
      . . . . . . . . . . . . . . . . >         count(*) over(partition by b1 order by c1) as count1
      . . . . . . . . . . . . . . . . > from 
      . . . . . . . . . . . . . . . . >         t1;
      +-------+---------+
      | sum1  | count1  |
      +-------+---------+
      | 1     | 1       |
      | 2     | 1       |
      | 3     | 1       |
      | 5     | 1       |
      | 6     | 1       |
      | 7     | 1       |
      | null  | 1       |
      | 9     | 1       |
      | 10    | 1       |
      | 4     | 1       |
      +-------+---------+
      10 rows selected (0.096 seconds)
      

      Attachments

        1. t1_parquet
          0.4 kB
          Victoria Markman

        Activity

          People

            seanhychu Sean Hsuan-Yi Chu
            vicky Victoria Markman
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: