Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-8381

Add support for filtered aggregate calls

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.20.3
    • 1.21.0
    • None
    • None

    Description

      Currently, Drill ignores filters for filtered aggregate calls and returns incorrect results.
      Here is the example query for which Drill will return incorrect results:

      SELECT count(n_name) FILTER(WHERE n_regionkey = 1) AS nations_count_in_1_region,
      count(n_name) FILTER(WHERE n_regionkey = 2) AS nations_count_in_2_region,
      count(n_name) FILTER(WHERE n_regionkey = 3) AS nations_count_in_3_region,
      count(n_name) FILTER(WHERE n_regionkey = 4) AS nations_count_in_4_region,
      count(n_name) FILTER(WHERE n_regionkey = 0) AS nations_count_in_0_region
      FROM cp.`tpch/nation.parquet`
      
      +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
      | nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region |
      +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
      | 25                        | 25                        | 25                        | 25                        | 25                        |
      +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
      

      But the correct result is

      +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
      | nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region |
      +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
      | 5                         | 5                         | 5                         | 5                         | 5                         |
      +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
      

      Side note:
      The query above could be rewritten using PIVOT:

      SELECT `1` nations_count_in_1_region, `2` nations_count_in_2_region, `3` nations_count_in_3_region, `4` nations_count_in_4_region, `0` nations_count_in_0_region
      FROM (SELECT n_name, n_regionkey FROM cp.`tpch/nation.parquet`) 
      PIVOT(count(n_name) FOR n_regionkey IN (0, 1, 2, 3, 4))
      

      And will return correct results when this issue is fixed and Calcite is updated to 1.33.0

      Attachments

        Activity

          People

            volodymyr Vova Vysotskyi
            volodymyr Vova Vysotskyi
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: