Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
Currently if a query has COUNT(DISTINCT x) and COUNT(DISTINCT y) we compute the distinct counts separately and combine them using a join. The join isn't too expensive (because usually the GROUP BY has only a few keys) but we make multiple scans over the base table.
I think we could translate multiple distinct-counts into a GROUPING SETS query (i.e. an Aggregate with more than one element in the groupSets field). If the underlying engine can evaluate that efficiently, then we have saved ourselves a join and several scans.