Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1805

Druid adapter incorrectly pushes down "COUNT(c)"; Druid only supports "COUNT(*)"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.0
    • Fix Version/s: 1.13.0
    • Component/s: druid
    • Labels:
      None

      Description

      Currently queries like

      select count(column) from table

      is pushed to druid as timeseries with an aggregator

       {"type":"count","name":"EXPR$0","fieldName":"countryName"} 

      Such an aggregator does not exist in druid. The count aggregator does only work as

       count(*) 

      .
      here is a test case that summarize the issue.

        @Test public void testCount() {
      
          final String sql = "SELECT count(\"countryName\") FROM (SELECT \"countryName\" FROM \"wikiticker\" WHERE \"countryName\"  IS NOT NULL) as a"; // correct count
          sql(sql, WIKI_AUTO2).returnsUnordered("EXPR$0=3799");
      
          final String sql2 = "SELECT count(\"countryName\") FROM (SELECT \"countryName\" FROM \"wikiticker\") as a";
          sql(sql2, WIKI_AUTO2).returnsUnordered("EXPR$0=3799"); // it will fail
        }

      First test will pass while the second will not.

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          slim bouguerra, Is Druid able to execute the equivalent of the following?

          select sum(case when "countryName" is null then 0 else 1 end) from "wikiticker"
          Show
          julianhyde Julian Hyde added a comment - slim bouguerra , Is Druid able to execute the equivalent of the following? select sum(case when "countryName" is null then 0 else 1 end) from "wikiticker"
          Hide
          bslim slim bouguerra added a comment -

          The example you gave yes but not an arbitrary one like the example below. Your example can be written a count of number of rows returned by group by countryName where countryName not null.

           
          select sum(case when "countryName" is null then -1 else 1 end) from "wikiticker"
          
          Show
          bslim slim bouguerra added a comment - The example you gave yes but not an arbitrary one like the example below. Your example can be written a count of number of rows returned by group by countryName where countryName not null. select sum( case when "countryName" is null then -1 else 1 end) from "wikiticker"
          Hide
          julianhyde Julian Hyde added a comment -

          I would propose to translate count(c) to sum(case when c is null then 0 else 1 end), and since Druid can handle the latter, we're good.

          Show
          julianhyde Julian Hyde added a comment - I would propose to translate count(c) to sum(case when c is null then 0 else 1 end) , and since Druid can handle the latter, we're good.
          Hide
          bslim slim bouguerra added a comment - - edited

          Julian Hyde doing what you suggest needs more ground work to support nested druid queries.
          Thus it is better for the moment to avoid pushing such aggregate to druid.
          I have this PR
          all the tests are passing beside `DruidAdapterIT.testGroupByAvgSumCount` that is failing with an NPE exception any clue ?

           
          java.lang.RuntimeException: exception while executing [select "state_province",
           avg("unit_sales") as a,
           sum("unit_sales") as s,
           count("store_sqft") as c,
           count(*) as c0
          from "foodmart"
          group by "state_province"
          order by 1]
          
          	at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1270)
          	at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1249)
          	at org.apache.calcite.test.CalciteAssert$AssertQuery.returnsUnordered(CalciteAssert.java:1276)
          	at org.apache.calcite.test.DruidAdapterIT.testGroupByAvgSumCount(DruidAdapterIT.java:1082)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
          	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
          	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
          	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
          	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
          	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
          	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
          	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
          	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
          	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
          	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
          	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
          	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
          	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
          	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
          	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
          	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
          	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
          Caused by: java.lang.RuntimeException: With materializationsEnabled=false, limit=2
          	at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:549)
          	at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1266)
          	... 25 more
          Caused by: java.sql.SQLException: Error while executing SQL "select "state_province",
           avg("unit_sales") as a,
           sum("unit_sales") as s,
           count("store_sqft") as c,
           count(*) as c0
          from "foodmart"
          group by "state_province"
          order by 1": null
          	at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
          	at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
          	at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
          	at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:218)
          	at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:517)
          	... 26 more
          Caused by: java.lang.NullPointerException
          	at org.apache.calcite.interpreter.AggregateNode.getAccumulator(AggregateNode.java:144)
          	at org.apache.calcite.interpreter.AggregateNode.<init>(AggregateNode.java:86)
          	at org.apache.calcite.interpreter.Nodes$CoreCompiler.visit(Nodes.java:47)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.apache.calcite.util.ReflectUtil.invokeVisitorInternal(ReflectUtil.java:257)
          	at org.apache.calcite.util.ReflectUtil.invokeVisitor(ReflectUtil.java:214)
          	at org.apache.calcite.util.ReflectUtil$1.invokeVisitor(ReflectUtil.java:465)
          	at org.apache.calcite.interpreter.Interpreter$Compiler.visit(Interpreter.java:479)
          	at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72)
          	at org.apache.calcite.interpreter.Interpreter$Compiler.visit(Interpreter.java:475)
          	at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72)
          	at org.apache.calcite.interpreter.Interpreter$Compiler.visit(Interpreter.java:475)
          	at org.apache.calcite.interpreter.Interpreter$Compiler.visitRoot(Interpreter.java:436)
          	at org.apache.calcite.interpreter.Interpreter.<init>(Interpreter.java:78)
          	at Baz.bind(Unknown Source)
          	at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:335)
          	at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
          	at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
          	at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
          	at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
          	at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
          	at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
          	at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:607)
          	at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:607)
          	at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
          	at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
          	... 28 more
          
          Show
          bslim slim bouguerra added a comment - - edited Julian Hyde doing what you suggest needs more ground work to support nested druid queries. Thus it is better for the moment to avoid pushing such aggregate to druid. I have this PR all the tests are passing beside `DruidAdapterIT.testGroupByAvgSumCount` that is failing with an NPE exception any clue ? java.lang.RuntimeException: exception while executing [select "state_province" , avg( "unit_sales" ) as a, sum( "unit_sales" ) as s, count( "store_sqft" ) as c, count(*) as c0 from "foodmart" group by "state_province" order by 1] at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1270) at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1249) at org.apache.calcite.test.CalciteAssert$AssertQuery.returnsUnordered(CalciteAssert.java:1276) at org.apache.calcite.test.DruidAdapterIT.testGroupByAvgSumCount(DruidAdapterIT.java:1082) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) Caused by: java.lang.RuntimeException: With materializationsEnabled= false , limit=2 at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:549) at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1266) ... 25 more Caused by: java.sql.SQLException: Error while executing SQL "select " state_province", avg( "unit_sales" ) as a, sum( "unit_sales" ) as s, count( "store_sqft" ) as c, count(*) as c0 from "foodmart" group by "state_province" order by 1": null at org.apache.calcite.avatica.Helper.createException(Helper.java:56) at org.apache.calcite.avatica.Helper.createException(Helper.java:41) at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:218) at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:517) ... 26 more Caused by: java.lang.NullPointerException at org.apache.calcite.interpreter.AggregateNode.getAccumulator(AggregateNode.java:144) at org.apache.calcite.interpreter.AggregateNode.<init>(AggregateNode.java:86) at org.apache.calcite.interpreter.Nodes$CoreCompiler.visit(Nodes.java:47) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.calcite.util.ReflectUtil.invokeVisitorInternal(ReflectUtil.java:257) at org.apache.calcite.util.ReflectUtil.invokeVisitor(ReflectUtil.java:214) at org.apache.calcite.util.ReflectUtil$1.invokeVisitor(ReflectUtil.java:465) at org.apache.calcite.interpreter.Interpreter$ Compiler .visit(Interpreter.java:479) at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) at org.apache.calcite.interpreter.Interpreter$ Compiler .visit(Interpreter.java:475) at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) at org.apache.calcite.interpreter.Interpreter$ Compiler .visit(Interpreter.java:475) at org.apache.calcite.interpreter.Interpreter$ Compiler .visitRoot(Interpreter.java:436) at org.apache.calcite.interpreter.Interpreter.<init>(Interpreter.java:78) at Baz.bind(Unknown Source) at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:335) at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294) at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193) at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:607) at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:607) at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615) at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148) ... 28 more
          Hide
          julianhyde Julian Hyde added a comment -

          Looks like CALCITE-1687.

          Show
          julianhyde Julian Hyde added a comment - Looks like CALCITE-1687 .
          Hide
          bslim slim bouguerra added a comment -

          Julian Hyde thanks, should i disable the test till that we fix that bug ?

          Show
          bslim slim bouguerra added a comment - Julian Hyde thanks, should i disable the test till that we fix that bug ?
          Hide
          julianhyde Julian Hyde added a comment -

          No, we make this bug depend on that bug. Neither issue is fixed right now. Disabling the test case would just masks that fact.

          By the way, what do you mean by "nested druid queries"? I didn't think it was possible to have a query within a query. Is there a case logged for that?

          Show
          julianhyde Julian Hyde added a comment - No, we make this bug depend on that bug. Neither issue is fixed right now. Disabling the test case would just masks that fact. By the way, what do you mean by "nested druid queries"? I didn't think it was possible to have a query within a query. Is there a case logged for that?
          Hide
          bslim slim bouguerra added a comment -

          Yes druid can process an arbitrary number of nested group by where the inner query result is a datasource of the outer one.
          This bug is fixed by the patch i proposed here https://github.com/apache/calcite/pull/463. We simply do not push a count operator when it includes a column.

          Show
          bslim slim bouguerra added a comment - Yes druid can process an arbitrary number of nested group by where the inner query result is a datasource of the outer one. This bug is fixed by the patch i proposed here https://github.com/apache/calcite/pull/463 . We simply do not push a count operator when it includes a column.
          Hide
          julianhyde Julian Hyde added a comment -

          I would call that "chained" rather than "nested".

          Show
          julianhyde Julian Hyde added a comment - I would call that "chained" rather than "nested".
          Hide
          julianhyde Julian Hyde added a comment -

          Looks good, but I can't review/commit until I get the Druid test VM running again.

          Show
          julianhyde Julian Hyde added a comment - Looks good, but I can't review/commit until I get the Druid test VM running again.
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

          Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/12e020e .

          Follow-up http://git-wip-us.apache.org/repos/asf/calcite/commit/a088aab fixes DruidAdapterIT.testGroupByAvgSumCount and DruidAdapterIT.testTimeExtractThatCannotBePushed .

          Thanks slim bouguerra

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/12e020e . Follow-up http://git-wip-us.apache.org/repos/asf/calcite/commit/a088aab fixes DruidAdapterIT.testGroupByAvgSumCount and DruidAdapterIT.testTimeExtractThatCannotBePushed . Thanks slim bouguerra
          Hide
          zhumayun Zain Humayun added a comment -

          I know this is marked as resolved, but I think there might be a problem.

          select count("customer_id") from "foodmart"; 
          

          Should not be pushed into Druid, and it is not with these changes, however both aggregates in the following query get pushed in:

          select count(*), count("customer_id") from "foodmart";
          

          Here is the final plan:

          EnumerableInterpreter
           DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$20]], groups=[{}], aggs=[[COUNT(), COUNT($0)]])
          

          I believe it's because the new logic in BAD_AGG returns on the first encounter of COUNT, however it should go to the next item in the aggregate call list if the COUNT can be pushed in.

          Show
          zhumayun Zain Humayun added a comment - I know this is marked as resolved, but I think there might be a problem. select count( "customer_id" ) from "foodmart" ; Should not be pushed into Druid, and it is not with these changes, however both aggregates in the following query get pushed in: select count(*), count( "customer_id" ) from "foodmart" ; Here is the final plan: EnumerableInterpreter DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], projects=[[$20]], groups=[{}], aggs=[[COUNT(), COUNT($0)]]) I believe it's because the new logic in BAD_AGG returns on the first encounter of COUNT , however it should go to the next item in the aggregate call list if the COUNT can be pushed in.
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment -

          Zain Humayun, thanks for pinging us about this! Tbh I had not checked the change as I thought the patch had already been reviewed.

          http://git-wip-us.apache.org/repos/asf/calcite/commit/bdaa485 fixes the issue with multiple aggregate functions.

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - Zain Humayun , thanks for pinging us about this! Tbh I had not checked the change as I thought the patch had already been reviewed. http://git-wip-us.apache.org/repos/asf/calcite/commit/bdaa485 fixes the issue with multiple aggregate functions.
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment -

          Resolved in release 1.13.0 (2017-06-26).

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - Resolved in release 1.13.0 (2017-06-26).

            People

            • Assignee:
              bslim slim bouguerra
              Reporter:
              bslim slim bouguerra
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development