Calcite / CALCITE-732

Implement multiple distinct-COUNT using GROUPING SETS

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels: None

      Description

      Currently, if a query has both COUNT(DISTINCT x) and COUNT(DISTINCT y), we compute each distinct count separately and combine the results using a join. The join itself isn't too expensive (the GROUP BY usually has only a few keys), but we make multiple scans over the base table.
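To make the cost of the current plan concrete, here is a minimal Python sketch that models it on an in-memory table. The table layout (`dept`, `x`, `y`), the sample rows, and the function names are all hypothetical, and this is not Calcite's code. It only mirrors the shape of the plan: one scan per distinct count, then a join on the GROUP BY key.

```python
from collections import defaultdict

# Hypothetical base table with columns (dept, x, y); not taken from the issue.
ROWS = [
    ("a", 1, 10),
    ("a", 1, 20),
    ("a", 2, 20),
    ("b", 3, 30),
    ("b", 3, None),
]

def count_distinct(rows, key_idx, arg_idx):
    """One full scan: COUNT(DISTINCT arg) GROUP BY key, ignoring NULLs."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key_idx]]  # touch the key so all-NULL groups still appear
        if row[arg_idx] is not None:
            seen[row[key_idx]].add(row[arg_idx])
    return {key: len(values) for key, values in seen.items()}

def distinct_counts_via_join(rows):
    """Two scans of the base table, then a join on the GROUP BY key."""
    count_x = count_distinct(rows, 0, 1)  # scan 1
    count_y = count_distinct(rows, 0, 2)  # scan 2
    # Both sides group by the same key, so they share the same key set.
    return {key: (count_x[key], count_y[key]) for key in count_x}

result = distinct_counts_via_join(ROWS)
print(result)
```

Each additional COUNT(DISTINCT ...) in the query adds another call to `count_distinct`, that is, another pass over `ROWS`.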

      I think we could translate multiple distinct-counts into a GROUPING SETS query (i.e., an Aggregate with more than one element in its groupSets field). If the underlying engine can evaluate that efficiently, we save ourselves a join and several scans.
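One way to picture the proposed rewrite is the Python sketch below. It is an assumption about how such a plan could look, not the implementation that was committed for this issue. The table layout (`dept`, `x`, `y`), the sample rows, and the function names are hypothetical. An inner step emulates an aggregate over the grouping sets (dept, x) and (dept, y) in a single scan, and an outer step counts the non-NULL values per `dept`.

```python
from collections import defaultdict

# Hypothetical base table with columns (dept, x, y); not taken from the issue.
ROWS = [
    ("a", 1, 10),
    ("a", 1, 20),
    ("a", 2, 20),
    ("b", 3, 30),
    ("b", 3, None),
]

def grouping_sets_scan(rows):
    """Single scan emulating GROUP BY GROUPING SETS ((dept, x), (dept, y)).

    Each output group is tagged with the grouping set that produced it.
    """
    groups = set()
    for dept, x, y in rows:
        groups.add(("x", dept, x))  # group from the (dept, x) set
        groups.add(("y", dept, y))  # group from the (dept, y) set
    return groups

def distinct_counts_via_grouping_sets(rows):
    """Outer aggregate: count non-NULL group values per dept and per set."""
    counts = defaultdict(lambda: {"x": 0, "y": 0})
    for tag, dept, value in grouping_sets_scan(rows):
        counts[dept]  # touch the key so all-NULL groups still appear
        if value is not None:
            counts[dept][tag] += 1
    return {dept: (c["x"], c["y"]) for dept, c in counts.items()}

result = distinct_counts_via_grouping_sets(ROWS)
print(result)
```

The base table is read once and no join is needed. Whether this is actually cheaper depends on how efficiently the underlying engine evaluates multiple grouping sets.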

            People

            • Assignee: Julian Hyde (julianhyde)
            • Reporter: Julian Hyde (julianhyde)