Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-733

Multiple distinct-COUNT query gives wrong results

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0-incubating
    • Component/s: None
    • Labels:
      None

      Description

      The query

      select "department_id" as d, count(distinct "education_level") as c1, count(distinct "gender") as c2 from foodmart_clone."employee" group by "department_id";

      returns 0 rows and should return 12. In the plan

      EnumerableCalc(expr#0..3=[{inputs}], department_id=[$t2], C1=[$t3], C2=[$t1])
        EnumerableJoin(condition=[=($0, $2)], joinType=[inner])
          EnumerableAggregate(group=[{0}], groups=[[{7}]], C2=[COUNT($1)])
            EnumerableAggregate(group=[{7, 15}])
              EnumerableTableScan(table=[[FOODMART_CLONE, employee]])
          EnumerableAggregate(group=[{0}], groups=[[{7}]], C1=[COUNT($1)])
            EnumerableAggregate(group=[{7, 13}])
              EnumerableTableScan(table=[[FOODMART_CLONE, employee]])
      

      you can see

      group=[{0}], groups=[[{7}]]

      and this is wrong – the groups should be made up of the same bits as the group. We should add an assert on this invariant and fixing it will probably cause the plan to return the right results.

        Attachments

          Activity

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              julianhyde Julian Hyde
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: