Details
-
New Feature
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.20.3
-
None
-
None
Description
Currently, Drill ignores filters for filtered aggregate calls and returns incorrect results.
Here is the example query for which Drill will return incorrect results:
SELECT count(n_name) FILTER(WHERE n_regionkey = 1) AS nations_count_in_1_region, count(n_name) FILTER(WHERE n_regionkey = 2) AS nations_count_in_2_region, count(n_name) FILTER(WHERE n_regionkey = 3) AS nations_count_in_3_region, count(n_name) FILTER(WHERE n_regionkey = 4) AS nations_count_in_4_region, count(n_name) FILTER(WHERE n_regionkey = 0) AS nations_count_in_0_region FROM cp.`tpch/nation.parquet`
+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ | nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region | +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ | 25 | 25 | 25 | 25 | 25 | +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
But the correct result is
+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ | nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region | +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ | 5 | 5 | 5 | 5 | 5 | +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
Side note:
The query above could be rewritten using PIVOT:
SELECT `1` nations_count_in_1_region, `2` nations_count_in_2_region, `3` nations_count_in_3_region, `4` nations_count_in_4_region, `0` nations_count_in_0_region FROM (SELECT n_name, n_regionkey FROM cp.`tpch/nation.parquet`) PIVOT(count(n_name) FOR n_regionkey IN (0, 1, 2, 3, 4))
And will return correct results when this issue is fixed and Calcite is updated to 1.33.0