Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1569

Code generation for fields of type java.sql.Date

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10.0
    • Fix Version/s: 1.12.0
    • Component/s: None
    • Labels:
      None

      Description

      Date condition can generates Integer == Integer, which is always false. Therefore a join condition that should match values fails to match them. Run the below query on calcite 1.10.

      select 
        l.cal_dt
        , sum(left_join_gvm) as left_join_sum
        , sum(inner_join_gvm) as inner_join_sum
      from
        (
          select test_kylin_fact.cal_dt, sum(price) as left_join_gvm
           from test_kylin_fact
          group by test_kylin_fact.cal_dt
        ) l
        ,
        (
          select test_kylin_fact.cal_dt, sum(price) as inner_join_gvm
           from test_kylin_fact
          group by test_kylin_fact.cal_dt
        ) i
      where
        l.cal_dt = i.cal_dt    -- this condition
      group by
        l.cal_dt
      

      The where condition generates Baz code like below.

      /* 284 */   final org.apache.calcite.linq4j.AbstractEnumerable child1 = new org.apache.calcite.linq4j.AbstractEnumerable(){
      /* 285 */     public org.apache.calcite.linq4j.Enumerator enumerator() {
      /* 286 */       return new org.apache.calcite.linq4j.Enumerator(){
      /* 287 */           public final org.apache.calcite.linq4j.Enumerator inputEnumerator = _inputEnumerable1.enumerator();
      /* 288 */           public void reset() {
      /* 289 */             inputEnumerator.reset();
      /* 290 */           }
      /* 291 */ 
      /* 292 */           public boolean moveNext() {
      /* 293 */             while (inputEnumerator.moveNext()) {
      /* 294 */               final Object[] current = (Object[]) inputEnumerator.current();
      /* 295 */               final Integer inp0_ = (Integer) current[0];
      /* 296 */               final Integer inp2_ = (Integer) current[2];
      /* 297 */               if (inp0_ != null && inp2_ != null && inp0_ == inp2_) {
      /* 298 */                 return true;
      /* 299 */               }
      /* 300 */             }
      /* 301 */             return false;
      /* 302 */           }
      /* 303 */ 
      /* 304 */           public void close() {
      /* 305 */             inputEnumerator.close();
      /* 306 */           }
      /* 307 */ 
      /* 308 */           public Object current() {
      /* 309 */             final Object[] current = (Object[]) inputEnumerator.current();
      /* 310 */             return new Object[] {
      /* 311 */                 current[0],
      /* 312 */                 current[1],
      /* 313 */                 current[3]};
      /* 314 */           }
      /* 315 */ 
      /* 316 */         };
      /* 317 */     }
      /* 318 */ 
      /* 319 */   };
      

      The problem is

       if (inp0_ != null && inp2_ != null && inp0_ == inp2_) 

      is always false, by using == to compare two Integers.

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          Resolved in release 1.12.0 (2017-03-24).

          Show
          julianhyde Julian Hyde added a comment - Resolved in release 1.12.0 (2017-03-24).
          Hide
          julianhyde Julian Hyde added a comment -
          Show
          julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/d335e48c . Thanks for the PR, zhen wang !
          Hide
          liukaige Kaige Liu added a comment - - edited

          The compare condition between Dates works now. But still not sure why that "inp0_ == inp2_" condition is generated. We will do some investigations from our side. Please go ahead for this JIRA. Thanks guys!

          I used the CSV example data to verify, below is the test case:

          select * from  
                      (select EMPS.JOINEDAT, sum(AGE) as left_join_age from EMPS group by EMPS.JOINEDAT) l, 
                      (select EMPS.JOINEDAT, sum(AGE) as right_join_age from EMPS group by EMPS.JOINEDAT) r  
                      where l.JOINEDAT = r.JOINEDAT
          
          Show
          liukaige Kaige Liu added a comment - - edited The compare condition between Dates works now. But still not sure why that "inp0_ == inp2_" condition is generated. We will do some investigations from our side. Please go ahead for this JIRA. Thanks guys! I used the CSV example data to verify, below is the test case: select * from (select EMPS.JOINEDAT, sum(AGE) as left_join_age from EMPS group by EMPS.JOINEDAT) l, (select EMPS.JOINEDAT, sum(AGE) as right_join_age from EMPS group by EMPS.JOINEDAT) r where l.JOINEDAT = r.JOINEDAT
          Hide
          liyang.gmt8@gmail.com liyang added a comment -

          I've asked help from Kaige to verify the patch.

          Show
          liyang.gmt8@gmail.com liyang added a comment - I've asked help from Kaige to verify the patch.
          Hide
          zhenw zhen wang added a comment -

          https://github.com/apache/calcite/pull/356 tests fixed with `mvn clean test`
          ( previous was due to my mis-understanding when switch back to master, it is also failing there)

          Julian Hyde I agree with you regarding this is not idea. I don't want some place middle plugged in a special logic and say `Date` need to be specially handled. but this is what I have with several attempts.

          I don't quite understand `fix when they enter the system`, what about when they are eventually selected out as result ? convert them back?

          to me, the problem seems the `RexToLixTranslator#convert`. ideally the fromType should be derived from Expression. but this is not the case for Input Arrays. as there could be all kinds of types on the columns, so the expression eventually derive `Object` as its type. obviously this has lost precision, we know that a[0] is Date, but I didn't manage to pass that in elegantly.
          to me, this is the problem.

          Anyways. I agree with your comment, if anyone want this fix they could patch. and this should be targeted to be fixed more elegantly.

          Show
          zhenw zhen wang added a comment - https://github.com/apache/calcite/pull/356 tests fixed with `mvn clean test` ( previous was due to my mis-understanding when switch back to master, it is also failing there) Julian Hyde I agree with you regarding this is not idea. I don't want some place middle plugged in a special logic and say `Date` need to be specially handled. but this is what I have with several attempts. I don't quite understand `fix when they enter the system`, what about when they are eventually selected out as result ? convert them back? to me, the problem seems the `RexToLixTranslator#convert`. ideally the fromType should be derived from Expression. but this is not the case for Input Arrays. as there could be all kinds of types on the columns, so the expression eventually derive `Object` as its type. obviously this has lost precision, we know that a [0] is Date, but I didn't manage to pass that in elegantly. to me, this is the problem. Anyways. I agree with your comment, if anyone want this fix they could patch. and this should be targeted to be fixed more elegantly.
          Hide
          julianhyde Julian Hyde added a comment -

          zhen wang, I get test failures in JdbcTest.testNullableTimestamp and JdbcTest.testNullableTimestamp2. Can you fix them, please.

          Show
          julianhyde Julian Hyde added a comment - zhen wang , I get test failures in JdbcTest.testNullableTimestamp and JdbcTest.testNullableTimestamp2 . Can you fix them, please.
          Hide
          julianhyde Julian Hyde added a comment -

          The fix isn't ideal. I'd rather that we convert java.sql.Date values when they enter the system. And, if we allow java.sql.Date values then we should treat java.util.Date, java.sql.Time and java.sql.Timestamp values similarly. But being pragmatic, I'm going to accept the patch because it fixes cases that didn't work previously. If we see further problems in this area, we might back out the changes and do it properly.

          liyang, Can you please test your code against the patch and confirm that it fixes the problem.

          zhen wang, Can you please submit a pull request, and post the URL here.

          Show
          julianhyde Julian Hyde added a comment - The fix isn't ideal. I'd rather that we convert java.sql.Date values when they enter the system. And, if we allow java.sql.Date values then we should treat java.util.Date, java.sql.Time and java.sql.Timestamp values similarly. But being pragmatic, I'm going to accept the patch because it fixes cases that didn't work previously. If we see further problems in this area, we might back out the changes and do it properly. liyang , Can you please test your code against the patch and confirm that it fixes the problem. zhen wang , Can you please submit a pull request, and post the URL here.
          Hide
          liyang.gmt8@gmail.com liyang added a comment -

          Thanks guys for making the progress. Let me know if the original full source is still needed. I can reproduce the error on my side again.

          Show
          liyang.gmt8@gmail.com liyang added a comment - Thanks guys for making the progress. Let me know if the original full source is still needed. I can reproduce the error on my side again.
          Hide
          zhenw zhen wang added a comment - - edited

          wrote another fix: https://github.com/zinking/calcite/commit/e7d45aba315acbd2865f87f77077d1b33bfb977a

          generated code met what you expect

          /* 112 */           public boolean moveNext() {
          /* 113 */             while (inputEnumerator.moveNext()) {
          /* 114 */               final Object[] current = (Object[]) inputEnumerator.current();
          /* 115 */               if ((java.sql.Date) current[1] != null && (java.sql.Date) current[0] != null && org.apache.calcite.runtime.SqlFunctions.toInt(current[1]) >= org.apache.calcite.runtime.SqlFunctions.toInt(current[0])) {
          /* 116 */                 return true;
          /* 117 */               }
          /* 118 */             }
          /* 119 */             return false;
          /* 120 */           }
          

          Julian Hyde this could also pass existing tests. but maybe too many changes. let me know what you think

          Show
          zhenw zhen wang added a comment - - edited wrote another fix: https://github.com/zinking/calcite/commit/e7d45aba315acbd2865f87f77077d1b33bfb977a generated code met what you expect /* 112 */ public boolean moveNext() { /* 113 */ while (inputEnumerator.moveNext()) { /* 114 */ final Object [] current = ( Object []) inputEnumerator.current(); /* 115 */ if ((java.sql.Date) current[1] != null && (java.sql.Date) current[0] != null && org.apache.calcite.runtime.SqlFunctions.toInt(current[1]) >= org.apache.calcite.runtime.SqlFunctions.toInt(current[0])) { /* 116 */ return true ; /* 117 */ } /* 118 */ } /* 119 */ return false ; /* 120 */ } Julian Hyde this could also pass existing tests. but maybe too many changes. let me know what you think
          Hide
          julianhyde Julian Hyde added a comment -

          It appears that toInt(Date) will throw if the date is null, so you need to check for null dates first. I'd probably write

          /* 115 */ if (current[1] != null && current[0] != null && org.apache.calcite.runtime.SqlFunctions.toInt(current[1]) >= org.apache.calcite.runtime.SqlFunctions.toInt(current[0])) { 

          but I'd accept anything equivalent that comes out of the code generator.

          Show
          julianhyde Julian Hyde added a comment - It appears that toInt(Date) will throw if the date is null, so you need to check for null dates first. I'd probably write /* 115 */ if (current[1] != null && current[0] != null && org.apache.calcite.runtime.SqlFunctions.toInt(current[1]) >= org.apache.calcite.runtime.SqlFunctions.toInt(current[0])) { but I'd accept anything equivalent that comes out of the code generator.
          Hide
          zhenw zhen wang added a comment -

          hmm, what will this line look like in your mind
          `/* 115 */ if (org.apache.calcite.runtime.SqlFunctions.internalToDate(current[1]) != null && org.apache.calcite.runtime.SqlFunctions.internalToDate(current[0]) != null && (Integer) current[1] >= (Integer) current[0]) {`

          is it

          `/* 115 */ if (org.apache.calcite.runtime.SqlFunctions.toInt(current[1]) != null && org.apache.calcite.runtime.SqlFunctions.toInt(current[0]) != null && (Integer) toInt(current[1]) >= (Integer) toInt( current[0])) {`

          ?

          Show
          zhenw zhen wang added a comment - hmm, what will this line look like in your mind `/* 115 */ if (org.apache.calcite.runtime.SqlFunctions.internalToDate(current [1] ) != null && org.apache.calcite.runtime.SqlFunctions.internalToDate(current [0] ) != null && (Integer) current [1] >= (Integer) current [0] ) {` is it `/* 115 */ if (org.apache.calcite.runtime.SqlFunctions.toInt(current [1] ) != null && org.apache.calcite.runtime.SqlFunctions.toInt(current [0] ) != null && (Integer) toInt(current [1] ) >= (Integer) toInt( current [0] )) {` ?
          Hide
          julianhyde Julian Hyde added a comment -

          I don't like the direction you're going in that fix. We don't want any Date values at run time (and by the way, we should be treating DATE, TIME and TIMESTAMP consistently). The right fix would be to treat a nullable DATE column the same way we treat a nullable INTEGER column.

          Show
          julianhyde Julian Hyde added a comment - I don't like the direction you're going in that fix. We don't want any Date values at run time (and by the way, we should be treating DATE, TIME and TIMESTAMP consistently). The right fix would be to treat a nullable DATE column the same way we treat a nullable INTEGER column.
          Hide
          zhenw zhen wang added a comment -

          thanks Vladimir Sitnikov for the test case. it seems if the storage type is set to `int` the RexLix converter doesn't seem to know to represent Date as Integer. original 'Date' type gets lost when passing to the ArrayIndex Input ref builder

          I had some draft work here: https://github.com/zinking/calcite/commit/23373a62af17fb5499348cf0925fd8a371659fb7
          which fixes the tests. I assume these are the same for other comparison operators and TIMESTAMP etc..
          I can fix them altogether if this fix direction looks correct Julian Hyde

          Show
          zhenw zhen wang added a comment - thanks Vladimir Sitnikov for the test case. it seems if the storage type is set to `int` the RexLix converter doesn't seem to know to represent Date as Integer. original 'Date' type gets lost when passing to the ArrayIndex Input ref builder I had some draft work here: https://github.com/zinking/calcite/commit/23373a62af17fb5499348cf0925fd8a371659fb7 which fixes the tests. I assume these are the same for other comparison operators and TIMESTAMP etc.. I can fix them altogether if this fix direction looks correct Julian Hyde
          Hide
          vladimirsitnikov Vladimir Sitnikov added a comment -

          liyang, can you please provide full debug source?

          When I try to join on dates, Calcite creates .join call, thus it does not generate that inp0_ == inp2_ condition.

          On the other hand, the following cases break in different ways (the tests to be placed in org.apache.calcite.test.ReflectiveSchemaTest):

          1)

              CalciteAssert.that()
                  .withSchema("s", CATCHALL)
                  .query("select a.v from (select \"sqlDate\" v from \"s\".\"everyTypes\" group by \"sqlDate\") a" +
                          ", (select \"sqlDate\" v from \"s\".\"everyTypes\" group by \"sqlDate\") b where a.v = b.v group by a.v")
                  .returnsUnordered("value=1", "value=3", "value=5");
          

          Produces "V=1970-01-01\nV=null"
          I do not think null row is valid here as "a.v = b.v" should not allow NULL string.

          2) greater or equal produces error:

                CalciteAssert.that()
                    .withSchema("s", CATCHALL)
                    .query("select a.v from (select \"sqlDate\" v from \"s\".\"everyTypes\" group by \"sqlDate\") a" +
                          ", (select \"sqlDate\" v from \"s\".\"everyTypes\" group by \"sqlDate\") b where a.v >= b.v group by a.v")
                  .returnsUnordered("value=1", "value=3", "value=5");
          
          /* 112 */           public boolean moveNext() {
          /* 113 */             while (inputEnumerator.moveNext()) {
          /* 114 */               final Object[] current = (Object[]) inputEnumerator.current();
          /* 115 */               if (org.apache.calcite.runtime.SqlFunctions.internalToDate(current[1]) != null && org.apache.calcite.runtime.SqlFunctions.internalToDate(current[0]) != null && (Integer) current[1] >= (Integer) current[0]) {
          /* 116 */                 return true;
          /* 117 */               }
          /* 118 */             }
          /* 119 */             return false;
          /* 120 */           }
          
          
          Caused by: org.codehaus.commons.compiler.CompileException: Line 115, Column 74: No applicable constructor/method found for actual parameters "java.lang.Object"; candidates are: "public static java.sql.Date org.apache.calcite.runtime.SqlFunctions.internalToDate(int)", "public static java.sql.Date org.apache.calcite.runtime.SqlFunctions.internalToDate(java.lang.Integer)"
          Show
          vladimirsitnikov Vladimir Sitnikov added a comment - liyang , can you please provide full debug source? When I try to join on dates, Calcite creates .join call, thus it does not generate that inp0_ == inp2_ condition. On the other hand, the following cases break in different ways (the tests to be placed in org.apache.calcite.test.ReflectiveSchemaTest ): 1) CalciteAssert.that() .withSchema( "s" , CATCHALL) .query( "select a.v from (select \" sqlDate\ " v from \" s\ ".\" everyTypes\ " group by \" sqlDate\ ") a" + ", (select \" sqlDate\ " v from \" s\ ".\" everyTypes\ " group by \" sqlDate\ ") b where a.v = b.v group by a.v" ) .returnsUnordered( "value=1" , "value=3" , "value=5" ); Produces "V=1970-01-01\nV=null" I do not think null row is valid here as "a.v = b.v" should not allow NULL string. 2) greater or equal produces error: CalciteAssert.that() .withSchema( "s" , CATCHALL) .query( "select a.v from (select \" sqlDate\ " v from \" s\ ".\" everyTypes\ " group by \" sqlDate\ ") a" + ", (select \" sqlDate\ " v from \" s\ ".\" everyTypes\ " group by \" sqlDate\ ") b where a.v >= b.v group by a.v" ) .returnsUnordered( "value=1" , "value=3" , "value=5" ); /* 112 */ public boolean moveNext() { /* 113 */ while (inputEnumerator.moveNext()) { /* 114 */ final Object[] current = (Object[]) inputEnumerator.current(); /* 115 */ if (org.apache.calcite.runtime.SqlFunctions.internalToDate(current[1]) != null && org.apache.calcite.runtime.SqlFunctions.internalToDate(current[0]) != null && (Integer) current[1] >= (Integer) current[0]) { /* 116 */ return true; /* 117 */ } /* 118 */ } /* 119 */ return false; /* 120 */ } Caused by: org.codehaus.commons.compiler.CompileException: Line 115, Column 74: No applicable constructor/method found for actual parameters "java.lang.Object"; candidates are: "public static java.sql.Date org.apache.calcite.runtime.SqlFunctions.internalToDate(int)", "public static java.sql.Date org.apache.calcite.runtime.SqlFunctions.internalToDate(java.lang.Integer)"

            People

            • Assignee:
              zhenw zhen wang
              Reporter:
              liyang.gmt8@gmail.com liyang
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development