Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-516

GROUP BY on a CASE expression containing IN predicate fails

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.1-incubating
    • Fix Version/s: 1.0.0-incubating
    • Component/s: None
    • Labels:

      Description

      The following query which does a Group-By on a CASE expression that contains IN predicate fails. If I replace the IN with an equality or <, > the query succeeds. I am not on latest Calcite master but I suspect the same problem exists there.

      select (case when emp.empno in (3) then 0 else 1 end) 
        from emp 
      group by (case when emp.empno in (3) then 0 else 1 end);
      
      java.lang.AssertionError: Internal error: while converting CASE WHEN `EMP`.`EMPNO` IN (3) THEN 0 ELSE 1 END
      	at org.eigenbase.util.Util.newInternal(Util.java:750)
      	at org.eigenbase.sql2rel.ReflectiveConvertletTable$1.convertCall(ReflectiveConvertletTable.java:93)
      	at org.eigenbase.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:52)
      	at org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4093)
      	at org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:1)
      	at org.eigenbase.sql.SqlCall.accept(SqlCall.java:125)
      	at org.eigenbase.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3988)
      	at org.eigenbase.sql2rel.SqlToRelConverter$AggConverter.addGroupExpr(SqlToRelConverter.java:4314)
      	at org.eigenbase.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2240)
      	at org.eigenbase.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2191)
      	at org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:514)
      	at org.eigenbase.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:474)
      

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          Closing now that 1.0.0-incubating has been released.

          Show
          julianhyde Julian Hyde added a comment - Closing now that 1.0.0-incubating has been released.
          Hide
          julianhyde Julian Hyde added a comment -

          It's not in general possible to test asserts. The cause of an assert failure is an internal error - a mistake in the code. End-user errors cannot trigger them, and neither can unit tests unless they access internals.

          This particular assert is saying that it thinks the validator did not do its job, the statement is invalid, and it should never have been submitted for sql-to-rel translation.

          Show
          julianhyde Julian Hyde added a comment - It's not in general possible to test asserts. The cause of an assert failure is an internal error - a mistake in the code. End-user errors cannot trigger them, and neither can unit tests unless they access internals. This particular assert is saying that it thinks the validator did not do its job, the statement is invalid, and it should never have been submitted for sql-to-rel translation.
          Hide
          amansinha100 Aman Sinha added a comment -

          Ok, sounds reasonable to add the assert back as part of CALCITE-551 perhaps with a negative unit test case that hits the assert.

          Show
          amansinha100 Aman Sinha added a comment - Ok, sounds reasonable to add the assert back as part of CALCITE-551 perhaps with a negative unit test case that hits the assert.
          Hide
          julianhyde Julian Hyde added a comment -

          The assertion is valid. You can't reference columns that are not in the GROUP BY clause if you're not inside an aggregate function. However, our logic to detect sub-query boundaries is broken when sub-queries are involved. IN is handled as a sub-query, so hits the broken logic.

          I removed the assert but it should be put back when we have fixed CALCITE-551.

          Show
          julianhyde Julian Hyde added a comment - The assertion is valid. You can't reference columns that are not in the GROUP BY clause if you're not inside an aggregate function. However, our logic to detect sub-query boundaries is broken when sub-queries are involved. IN is handled as a sub-query, so hits the broken logic. I removed the assert but it should be put back when we have fixed CALCITE-551 .
          Hide
          julianhyde Julian Hyde added a comment -
          Show
          julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/46b56918 . Thanks for the patch, Aman!
          Hide
          amansinha100 Aman Sinha added a comment -

          Uploaded patch that fixes both queries in this JIRA. Added unit tests. Please review. As mentioned in the previous comment, I had to remove an assertion that seemed too aggressive or unnecessary. If anyone has a negative test case that is supposed to be detected by that assertion, let me know.

          Show
          amansinha100 Aman Sinha added a comment - Uploaded patch that fixes both queries in this JIRA. Added unit tests. Please review. As mentioned in the previous comment, I had to remove an assertion that seemed too aggressive or unnecessary. If anyone has a negative test case that is supposed to be detected by that assertion, let me know.
          Hide
          amansinha100 Aman Sinha added a comment -

          The following query with an aggregation but without group-by also fails:

             SELECT SUM(CASE WHEN empno IN (3) THEN 0 ELSE 1 END) FROM emp;
          
          java.lang.AssertionError: Internal error: Identifier 'EMP.EMPNO' is not a group expr
          	at org.apache.calcite.util.Util.newInternal(Util.java:727)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3307)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.access$6(SqlToRelConverter.java:3296)
          	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4299)
          	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:1)
          	at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:271)
          	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4182)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.substituteSubquery(SqlToRelConverter.java:979)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.replaceSubqueries(SqlToRelConverter.java:937)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2685)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2525)
          	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:608)
          

          It is not clear what is the purpose of this assertion in convertIdentifier(). It just checks for bb.agg != null and throws an internal error. Removing this assert allows the query to complete and all other unit tests passed; however I am not sure if there are negative tests coverage.

          Show
          amansinha100 Aman Sinha added a comment - The following query with an aggregation but without group-by also fails: SELECT SUM(CASE WHEN empno IN (3) THEN 0 ELSE 1 END) FROM emp; java.lang.AssertionError: Internal error: Identifier 'EMP.EMPNO' is not a group expr at org.apache.calcite.util.Util.newInternal(Util.java:727) at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3307) at org.apache.calcite.sql2rel.SqlToRelConverter.access$6(SqlToRelConverter.java:3296) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4299) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:1) at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:271) at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4182) at org.apache.calcite.sql2rel.SqlToRelConverter.substituteSubquery(SqlToRelConverter.java:979) at org.apache.calcite.sql2rel.SqlToRelConverter.replaceSubqueries(SqlToRelConverter.java:937) at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2685) at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2525) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:608) It is not clear what is the purpose of this assertion in convertIdentifier(). It just checks for bb.agg != null and throws an internal error. Removing this assert allows the query to complete and all other unit tests passed; however I am not sure if there are negative tests coverage.
          Hide
          julianhyde Julian Hyde added a comment -

          I think you're on the right track. Note that the error occurs during sql-to-rel. It has passed through validation successfully, and recognized that, in SqlNode format, the two occurrences of case when emp.empno in (3) then 0 else 1 end are structurally identical, as required for them to be grouping expressions. But it looks as if the expressions are translated separately and therefore their RexNode representations look different. IIRC there's some logic in there to track aliased expressions through translation, but if so, it's not working correctly here.

          Show
          julianhyde Julian Hyde added a comment - I think you're on the right track. Note that the error occurs during sql-to-rel. It has passed through validation successfully, and recognized that, in SqlNode format, the two occurrences of case when emp.empno in (3) then 0 else 1 end are structurally identical, as required for them to be grouping expressions. But it looks as if the expressions are translated separately and therefore their RexNode representations look different. IIRC there's some logic in there to track aliased expressions through translation, but if so, it's not working correctly here.
          Hide
          amansinha100 Aman Sinha added a comment -

          There are 2 issues here:
          1. createAggImpl() calls replaceSubqueries() only for the aggList. It also needs to call replaceSubqueries for the groupList. Making this change allows the CASE conversion to go through.
          2. Subsequently, I get an error that EMP.EMPNO is not a group expr. I think that the mapSubqueryToExpr is only keeping track of the column on the left side of IN whereas it should use the CASE expression itself as the key.
          I will look into this some more...

          Show
          amansinha100 Aman Sinha added a comment - There are 2 issues here: 1. createAggImpl() calls replaceSubqueries() only for the aggList. It also needs to call replaceSubqueries for the groupList. Making this change allows the CASE conversion to go through. 2. Subsequently, I get an error that EMP.EMPNO is not a group expr. I think that the mapSubqueryToExpr is only keeping track of the column on the left side of IN whereas it should use the CASE expression itself as the key. I will look into this some more...

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              amansinha100 Aman Sinha
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development