Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1800

JDBC adapter fails on query with UNION in FROM clause

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.9.0
    • Fix Version/s: 1.13.0
    • Component/s: core
    • Labels:
      None

      Description

      Relational representation of the following query fails to be converter to sql string in Calcite:

      select sum(case when "product_id"=0 then "net_weight" else 0 end) as net_weight
      from ( select "product_id", "net_weight" from "product"  
      union all
      select "product_id", 0 as "net_weight" from "sales_fact_1997" ) t0
      

      The conversion fails with:

      java.lang.RuntimeException: While invoking method 'public org.apache.calcite.rel.rel2sql.SqlImplementor$Result org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(org.apache.calcite.rel.core.Aggregate)'
      
      	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:528)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverter.dispatch(RelToSqlConverter.java:96)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverter.visitChild(RelToSqlConverter.java:100)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverterTest$Sql.ok(RelToSqlConverterTest.java:1559)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverterTest.testUnionWrappedInASelect(RelToSqlConverterTest.java:654)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
      	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
      	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
      	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
      	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:262)
      	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
      Caused by: java.lang.reflect.InvocationTargetException
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:525)
      	... 31 more
      Caused by: java.lang.RuntimeException: While invoking method 'public org.apache.calcite.rel.rel2sql.SqlImplementor$Result org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(org.apache.calcite.rel.core.Project)'
      	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:528)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverter.dispatch(RelToSqlConverter.java:96)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverter.visitChild(RelToSqlConverter.java:100)
      	at org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(RelToSqlConverter.java:179)
      	... 36 more
      Caused by: java.lang.reflect.InvocationTargetException
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:525)
      	... 39 more
      Caused by: java.lang.AssertionError: field ordinal 0 out of range {t1=RecordType(DOUBLE $f0)}
      	at org.apache.calcite.rel.rel2sql.SqlImplementor$AliasContext.field(SqlImplementor.java:889)
      	at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:521)
      

      Test case for the issue (RelToSqlConverterTest):

        @Test public void testUnionWrappedInASelect() {
          String query =
                    "select sum(case when \"product_id\"=0 then \"net_weight\" else 0 end) as net_weight from ( "
                  + "select \"product_id\", \"net_weight\" from \"product\" "
                  + " union all "
                  + "select \"product_id\", 0 as \"net_weight\" from \"sales_fact_1997\" "
                  + ") t0";
          String expected =
                    "SELECT SUM(CASE WHEN \"product_id\" = 0 THEN \"net_weight\" ELSE 0 END) AS \"NET_WEIGHT\"\n"
                  + "FROM (SELECT \"product_id\", \"net_weight\"\n"
                  + "FROM \"foodmart\".\"product\"\n"
                  + "UNION ALL\n"
                  + "SELECT \"product_id\", 0 AS \"net_weight\"\n"
                  + "FROM \"foodmart\".\"sales_fact_1997\") AS \"t1\"";
          sql(query).ok(expected);
        }
      

      The issue is caused by the fix: CALCITE-1332

      The issue should be resolved, as the case is pretty common and the failing sql is fairly simple.

        Activity

        Hide
        victor.batyt@gmail.com Viktor Batytskyi added a comment -
        Show
        victor.batyt@gmail.com Viktor Batytskyi added a comment - Julian Hyde Please take a look: https://github.com/apache/calcite/pull/454
        Hide
        julianhyde Julian Hyde added a comment -

        I can reproduce the case, and I have verified that the one line fix solves the problem, but I don't understand quite why. My intuition says that a UNION doesn't per se require an alias, yet we are requiring one.

        MinJi Kim, Can please you review this fix?

        Show
        julianhyde Julian Hyde added a comment - I can reproduce the case, and I have verified that the one line fix solves the problem, but I don't understand quite why. My intuition says that a UNION doesn't per se require an alias, yet we are requiring one. MinJi Kim , Can please you review this fix?
        Hide
        minjikim MinJi Kim added a comment - - edited

        Sure thing! I will take a look at it soon.

        Show
        minjikim MinJi Kim added a comment - - edited Sure thing! I will take a look at it soon.
        Hide
        minjikim MinJi Kim added a comment - - edited

        It looks like the problem is whether we should create a new AliasContext. The plan we get after conversion is as follows. After LogicalUnion, we have an alias "t1" for the subtree including LogicalUnion and below, with row type (product_id, net_weight).

        LogicalAggregate(group=[{}], NET_WEIGHT=[SUM($0)])
          LogicalProject($f0=[CASE(=($0, 0), $1, 0)])
            LogicalUnion(all=[true])
              LogicalProject(product_id=[$1], net_weight=[$7])
                LogicalTableScan(table=[[foodmart, product]])
              LogicalProject(product_id=[$0], net_weight=[0])
                LogicalTableScan(table=[[foodmart, sales_fact_1997]])
        

        Sine Unions are SET_OP operations, in the Result.builder(), we set the flag needNew to true (selectList is null in this case) when visiting LogicalProject above the Union, and so we enter the following if-clause. Now, in this case, the LogicalProject is reducing the row type to a single column, so we now generate an incorrect row type for alias "t1". Basically, when the alias is unchanging, we shouldn't be creating a new alias.

              } else {
                boolean qualified =
                    !dialect.hasImplicitTableAlias() || aliases.size() > 1;
                if (needNew) {  // <--- this is where the patch fixes so that we don't enter here.
                  newContext =
                      aliasContext(ImmutableMap.of(neededAlias, rel.getRowType()),
                          qualified);
                } else {
                  newContext = aliasContext(aliases, qualified);
                }
              }
        

        At the minimum, I would change the patch so that it checks that aliases.size() == 1 and that the aliases are the same (i.e. neededAlias is equal to the only key is aliases) – if there are two aliases say "t1" and "t2" in aliases, and we are collapsing the two aliases into a single alias "t1", even if "t1" is contained in aliases, we should actually create a new alias. But, I wonder if we should get rid of it all and revert the code back to what it used to be. Because I think in this case, we have no selectList, there shouldn't be any rowtype change.

              } else {
                boolean qualified =
                    !dialect.hasImplicitTableAlias() || aliases.size() > 1;
                newContext = aliasContext(aliases, qualified);
              }
        

        In this case, only one test fails RelToSqlConverterTest.testSelectQueryWithGroupByHaving3(). I am looking into that right now.

        Show
        minjikim MinJi Kim added a comment - - edited It looks like the problem is whether we should create a new AliasContext. The plan we get after conversion is as follows. After LogicalUnion, we have an alias "t1" for the subtree including LogicalUnion and below, with row type (product_id, net_weight). LogicalAggregate(group=[{}], NET_WEIGHT=[SUM($0)]) LogicalProject($f0=[CASE(=($0, 0), $1, 0)]) LogicalUnion(all=[ true ]) LogicalProject(product_id=[$1], net_weight=[$7]) LogicalTableScan(table=[[foodmart, product]]) LogicalProject(product_id=[$0], net_weight=[0]) LogicalTableScan(table=[[foodmart, sales_fact_1997]]) Sine Unions are SET_OP operations, in the Result.builder(), we set the flag needNew to true (selectList is null in this case) when visiting LogicalProject above the Union, and so we enter the following if-clause. Now, in this case, the LogicalProject is reducing the row type to a single column, so we now generate an incorrect row type for alias "t1". Basically, when the alias is unchanging, we shouldn't be creating a new alias. } else { boolean qualified = !dialect.hasImplicitTableAlias() || aliases.size() > 1; if (needNew) { // <--- this is where the patch fixes so that we don't enter here. newContext = aliasContext(ImmutableMap.of(neededAlias, rel.getRowType()), qualified); } else { newContext = aliasContext(aliases, qualified); } } At the minimum, I would change the patch so that it checks that aliases.size() == 1 and that the aliases are the same (i.e. neededAlias is equal to the only key is aliases) – if there are two aliases say "t1" and "t2" in aliases, and we are collapsing the two aliases into a single alias "t1", even if "t1" is contained in aliases, we should actually create a new alias. But, I wonder if we should get rid of it all and revert the code back to what it used to be. Because I think in this case, we have no selectList, there shouldn't be any rowtype change. } else { boolean qualified = !dialect.hasImplicitTableAlias() || aliases.size() > 1; newContext = aliasContext(aliases, qualified); } In this case, only one test fails RelToSqlConverterTest.testSelectQueryWithGroupByHaving3(). I am looking into that right now.
        Hide
        minjikim MinJi Kim added a comment - - edited

        It looks like my previous suggestion is not going to work. In RelToSqlConverterTest.testSelectQueryWithGroupByHaving3(), we have the following query plan:

        LogicalProject(product_id=[$0], EXPR$1=[$1])
          LogicalFilter(condition=[>($0, 100)])
            LogicalProject(product_id=[$0], EXPR$1=[$1])
              LogicalFilter(condition=[>($2, 1)])
                LogicalAggregate(group=[{0}], EXPR$1=[MIN($1)], agg#1=[COUNT()])
                  LogicalProject(product_id=[$1], store_id=[$19])
                    LogicalJoin(condition=[=($1, $15)], joinType=[inner])
                      LogicalTableScan(table=[[foodmart, product]])
                      LogicalTableScan(table=[[foodmart, sales_fact_1997]])
        

        The subtree below "LogicalFilter(condition=[>($0, 100)])" generates the following sql query, which I believe is correct. Now, with the additional filter, "LogicalFilter(condition=[>($0, 100)])", we need to make sure that a new AliasContext is generated.

        SELECT `product`.`product_id`, MIN(`sales_fact_1997`.`store_id`)
        FROM `foodmart`.`product`
        INNER JOIN `foodmart`.`sales_fact_1997` ON `product`.`product_id` = `sales_fact_1997`.`product_id`
        GROUP BY `product`.`product_id`
        HAVING COUNT(*) > 1
        

        In this case, selectList is null since this is a filter, and we fall through and never generate a new AliasContext if we use my previously proposed fix. So, actually we do need the if-else-clause as before. I do think that we need to be a bit more specific as to when we enter the if-else-clause. We not only need check that we used the new alias (i.e. needNew is true and neededAlias != null), but also check that our aliases is of size 1. Since we could have aliases = [<t1, rowType1>, <t2, rowType2>], with neededAlias = t1 (although I believe this should not happen in our code currently, but I think if our aliases context has two elements, "t1" and "t2", and we want to re-name the entire subquery as "t1", that should be a new AliasContext with a single element).

              } else {
                boolean qualified =
                    !dialect.hasImplicitTableAlias() || aliases.size() > 1;
                // basically, we did a subSelect() since needNew is set and neededAlias is not null
                // now, we need to make sure that we need to update the alias context.
                // if our aliases map has a single element:  <neededAlias, rowType>,
                // then we don't need to rewrite the alias but otherwise, it should be updated.
                if (needNew
                    && neededAlias != null
                    && (aliases.size() != 1 || !aliases.containsKey(neededAlias))) {
                  newContext = aliasContext(ImmutableMap.of(neededAlias, rel.getRowType()), qualified);
                } else {
                  newContext = aliasContext(aliases, qualified);
                }
              }
              return new Builder(rel, clauseList, select, newContext, aliases);
        

        Julian Hyde and Viktor Batytskyi does this sound reasonable? With above change, all tests seem to pass too.

        Show
        minjikim MinJi Kim added a comment - - edited It looks like my previous suggestion is not going to work. In RelToSqlConverterTest.testSelectQueryWithGroupByHaving3(), we have the following query plan: LogicalProject(product_id=[$0], EXPR$1=[$1]) LogicalFilter(condition=[>($0, 100)]) LogicalProject(product_id=[$0], EXPR$1=[$1]) LogicalFilter(condition=[>($2, 1)]) LogicalAggregate(group=[{0}], EXPR$1=[MIN($1)], agg#1=[COUNT()]) LogicalProject(product_id=[$1], store_id=[$19]) LogicalJoin(condition=[=($1, $15)], joinType=[ inner ]) LogicalTableScan(table=[[foodmart, product]]) LogicalTableScan(table=[[foodmart, sales_fact_1997]]) The subtree below "LogicalFilter(condition= [>($0, 100)] )" generates the following sql query, which I believe is correct. Now, with the additional filter, "LogicalFilter(condition= [>($0, 100)] )", we need to make sure that a new AliasContext is generated. SELECT `product`.`product_id`, MIN(`sales_fact_1997`.`store_id`) FROM `foodmart`.`product` INNER JOIN `foodmart`.`sales_fact_1997` ON `product`.`product_id` = `sales_fact_1997`.`product_id` GROUP BY `product`.`product_id` HAVING COUNT(*) > 1 In this case, selectList is null since this is a filter, and we fall through and never generate a new AliasContext if we use my previously proposed fix. So, actually we do need the if-else-clause as before. I do think that we need to be a bit more specific as to when we enter the if-else-clause. We not only need check that we used the new alias (i.e. needNew is true and neededAlias != null), but also check that our aliases is of size 1. Since we could have aliases = [<t1, rowType1>, <t2, rowType2>] , with neededAlias = t1 (although I believe this should not happen in our code currently, but I think if our aliases context has two elements, "t1" and "t2", and we want to re-name the entire subquery as "t1", that should be a new AliasContext with a single element). } else { boolean qualified = !dialect.hasImplicitTableAlias() || aliases.size() > 1; // basically, we did a subSelect() since needNew is set and neededAlias is not null // now, we need to make sure that we need to update the alias context. // if our aliases map has a single element: <neededAlias, rowType>, // then we don't need to rewrite the alias but otherwise, it should be updated. if (needNew && neededAlias != null && (aliases.size() != 1 || !aliases.containsKey(neededAlias))) { newContext = aliasContext(ImmutableMap.of(neededAlias, rel.getRowType()), qualified); } else { newContext = aliasContext(aliases, qualified); } } return new Builder(rel, clauseList, select, newContext, aliases); Julian Hyde and Viktor Batytskyi does this sound reasonable? With above change, all tests seem to pass too.
        Hide
        julianhyde Julian Hyde added a comment -

        MinJi Kim, I was thinking on similar lines (but I didn't have the stamina to turn it into a real fix). Sounds good, and thank you.

        Viktor Batytskyi, What do you think?

        Show
        julianhyde Julian Hyde added a comment - MinJi Kim , I was thinking on similar lines (but I didn't have the stamina to turn it into a real fix). Sounds good, and thank you. Viktor Batytskyi , What do you think?
        Hide
        victor.batyt@gmail.com Viktor Batytskyi added a comment -

        MinJi Kim Julian Hyde
        The improvement seems reasonable, additional checks won't hurt. Applied them:
        https://github.com/apache/calcite/pull/454

        Thank you, MinJi Kim and Julian Hyde!

        Show
        victor.batyt@gmail.com Viktor Batytskyi added a comment - MinJi Kim Julian Hyde The improvement seems reasonable, additional checks won't hurt. Applied them: https://github.com/apache/calcite/pull/454 Thank you, MinJi Kim and Julian Hyde !
        Hide
        minjikim MinJi Kim added a comment - - edited

        +1 Looks good to me. Thanks, Viktor Batytskyi

        Julian Hyde Do you think it is ready?

        Show
        minjikim MinJi Kim added a comment - - edited +1 Looks good to me. Thanks, Viktor Batytskyi Julian Hyde Do you think it is ready?
        Hide
        julianhyde Julian Hyde added a comment -

        Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/f6af061c. Thanks for the PR, Viktor Batytskyi! MinJi Kim, I added your name to the commit message because some of the code was yours.

        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/f6af061c . Thanks for the PR, Viktor Batytskyi ! MinJi Kim , I added your name to the commit message because some of the code was yours.
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Resolved in release 1.13.0 (2017-06-26).

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Resolved in release 1.13.0 (2017-06-26).

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            victor.batyt@gmail.com Viktor Batytskyi
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 2h
              2h
              Remaining:
              Remaining Estimate - 2h
              2h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development