- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 1.4.0-incubating
- Component/s: None
- Labels: None
Currently, if a query contains both COUNT(DISTINCT x) and COUNT(DISTINCT y), we compute each distinct count separately and combine the results using a join. The join itself isn't too expensive (because usually the GROUP BY has only a few keys), but each distinct count requires its own scan over the base table.
I think we could translate multiple distinct-counts into a GROUPING SETS query (i.e. an Aggregate with more than one element in the groupSets field). If the underlying engine can evaluate that efficiently, then we have saved ourselves a join and several scans.
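To make the idea concrete, here is a minimal sketch in plain Python rather than in the planner itself. It emulates a query of the shape `SELECT k, COUNT(DISTINCT x), COUNT(DISTINCT y) FROM t GROUP BY k` by first grouping on the sets (k, x) and (k, y) in a single pass, then counting groups per key. The function names, the `(k, x, y)` row layout, and the `set_id` tag are all hypothetical and chosen for illustration; this is not the rewrite rule that was actually implemented.

```python
def distinct_counts_via_grouping_sets(rows):
    """rows: iterable of (k, x, y) tuples.

    Returns {k: (count_distinct_x, count_distinct_y)}.
    """
    # Phase 1: one scan over the base rows, emulating
    #   GROUP BY GROUPING SETS ((k, x), (k, y)).
    # Each group is tagged with the set that produced it:
    # set_id 0 for (k, x), set_id 1 for (k, y).
    groups = set()
    for k, x, y in rows:
        groups.add((0, k, x))
        groups.add((1, k, y))

    # Phase 2: aggregate the (much smaller) grouped result by k.
    # NULL (None) values are skipped, as COUNT(DISTINCT ...) ignores NULLs.
    counts = {}
    for set_id, k, value in groups:
        per_key = counts.setdefault(k, [0, 0])
        if value is not None:
            per_key[set_id] += 1
    return {k: tuple(c) for k, c in counts.items()}


def distinct_counts_via_separate_scans(rows):
    """Baseline: one scan per distinct count, then a join on k."""
    rows = list(rows)
    x_sets, y_sets = {}, {}
    for k, x, _ in rows:  # scan 1: distinct x per key
        x_sets.setdefault(k, set())
        if x is not None:
            x_sets[k].add(x)
    for k, _, y in rows:  # scan 2: distinct y per key
        y_sets.setdefault(k, set())
        if y is not None:
            y_sets[k].add(y)
    # Join the two per-key results on k.
    return {k: (len(x_sets[k]), len(y_sets[k])) for k in x_sets}


sample = [
    ("a", 1, 10),
    ("a", 1, 20),
    ("a", 2, 20),
    ("b", 3, None),
    ("b", 3, 30),
]
result = distinct_counts_via_grouping_sets(sample)
# result == {"a": (2, 2), "b": (1, 1)}
```

Both functions return the same answer, but the grouping-sets version reads the base rows once and needs no join. Whether that is a win in practice depends, as noted above, on how efficiently the underlying engine evaluates grouping sets.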