  1. Apache Drill
  2. DRILL-4512

Revisit the changes for DRILL-3404 (using SUM0 for window function)

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 1.6.0
    • None
    • None

    Description

      DRILL-3404 was an incorrect-results issue involving the SUM0 window function over a nullable column containing null values. The change made in Calcite for that issue should be reverted: on the latest master, I still get the correct result after reverting the Calcite change. The EXPLAIN plan also shows that the new plan differs from the old one. It seems there may have been nullability-related fix(es) in Calcite since then.

      New plan after reverting the change for DRILL-3404:

      00-00    Screen
      00-01      Project(c1=[$0], c2=[$1], w_sum=[$2])
      00-02        Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
      00-03          SelectionVectorRemover
      00-04            Filter(condition=[>($2, 0)])
      00-05              Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
      00-06                SelectionVectorRemover
      00-07                  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
      00-08                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/asinha/incubator-drill/exec/java-exec/src/test/resources/window/table_with_nulls.parquet]], selectionRoot=file:/Users/asinha/incubator-drill/exec/java-exec/src/test/resources/window/table_with_nulls.parquet, numFiles=1, usedMetadataFile=false, columns=[`c1`, `c2`]]])
      
      

      For reference, here's the old plan copied from DRILL-3404:

      00-00    Screen
      00-01      Project(c1=[$0], c2=[$1], w_sum=[$2])
      00-02        Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
      00-03          Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
      00-04            SelectionVectorRemover
      00-05              Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
      00-06                Project(c1=[$1], c2=[$0])
      00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
      

      Notice that the two plans differ: the new plan contains an extra Filter(condition=[>($2, 0)]) at 00-04, with a SelectionVectorRemover above it, between the Window and the upper Project.
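
      Both plans compute w_sum as CASE(>($2, 0), $3, null). Reading $2 as COUNT($0) and $3 as $SUM0($0) from the Window operator, this is the usual expansion of SUM into COUNT and SUM0: SUM0 returns 0 for a frame with no non-null values, and the CASE turns that 0 back into null. The sketch below is only an illustration of those semantics in plain Python, not Drill or Calcite code. The function names and the sample rows are hypothetical, and it assumes the partition column c2 contains no nulls.

```python
from itertools import groupby

def order_key(c1):
    # ASC-nulls-first: nulls sort before every non-null value.
    return (0, 0) if c1 is None else (1, c1)

def count_non_null(frame):
    # COUNT(x): number of non-null values in the frame.
    return sum(1 for v in frame if v is not None)

def sum0(frame):
    # $SUM0(x): sum of non-null values, 0 when there are none.
    return sum(v for v in frame if v is not None)

def windowed_sum(rows):
    """rows: list of (c1, c2). Emulates
    SUM(c1) OVER (PARTITION BY c2 ORDER BY c1 NULLS FIRST
                  RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    via CASE WHEN COUNT(c1) > 0 THEN $SUM0(c1) ELSE NULL END."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[1], order_key(r[0])))
    for c2, part in groupby(ordered, key=lambda r: r[1]):
        part = list(part)
        for c1, _ in part:
            # RANGE frame: every row up to and including the current row's peers.
            frame = [v for v, _ in part if order_key(v) <= order_key(c1)]
            cnt, s0 = count_non_null(frame), sum0(frame)
            out.append((c1, c2, s0 if cnt > 0 else None))
    return out

# Hypothetical sample data, not the contents of table_with_nulls.parquet.
rows = [(None, "a"), (1, "a"), (2, "a"), (None, "b"), (None, "b"), (5, "b")]
result = windowed_sum(rows)
```

      With these sample rows, the frames that contain only nulls yield a null w_sum, whereas sum0 alone would report 0 for them.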

          People

            amansinha100 Aman Sinha