Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-2041

When ReduceExpressionRule simplifies a nullable expression, allow the result to change type to NOT NULL

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15.0
    • Component/s: None
    • Labels:
      None

      Description

      In some cases, the user needs to select whether or not to add casts that match nullability.
      One of the motivations behind this is to avoid unnecessary casts like the following example.
      original filter

      OR(AND(>=($0, CAST(_UTF-16LE'2010-01-01 00:00:00 UTC'):TIMESTAMP_WITH_LOCAL_TIME_ZONE(15)), <=($0, CAST(_UTF-16LE'2012-03-01 00:00:00 UTC'):TIMESTAMP_WITH_LOCAL_TIME_ZONE(15))))
      

      the optimized expression with matching nullability

      OR(AND(CAST(>=($0, 2010-01-01 00:00:00)):BOOLEAN, CAST(<=($0, 2012-03-01 00:00:00)):BOOLEAN))
      

      As you can see this extra cast gets into the way of following plan optimization steps.
      The desired expression can be obtained by turning off the nullability matching.

      OR(AND(>=($0, 2010-01-01 00:00:00), <=($0, 2012-03-01 00:00:00)))
      

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          Resolved in release 1.15.0 (2017-12-11).

          Show
          julianhyde Julian Hyde added a comment - Resolved in release 1.15.0 (2017-12-11).
          Hide
          julianhyde Julian Hyde added a comment -

          Fixed in 0ea976ee. Thanks, slim bouguerra and Jesus Camacho Rodriguez.

          I also parameterized the existing ReduceExpressionRule sub-classes so that we can test them in the new matchNullability mode.

          Show
          julianhyde Julian Hyde added a comment - Fixed in 0ea976ee . Thanks, slim bouguerra and Jesus Camacho Rodriguez . I also parameterized the existing ReduceExpressionRule sub-classes so that we can test them in the new matchNullability mode.
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment -

          Julian Hyde, I have added a unit test and additional comments in : https://github.com/apache/calcite/pull/570

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - Julian Hyde , I have added a unit test and additional comments in : https://github.com/apache/calcite/pull/570
          Hide
          julianhyde Julian Hyde added a comment -

          We would like this to go into 1.15 but the PR isn't ready. The PR needs tests, and does not have a clear explanation of the new functionality.

          Jesus Camacho Rodriguez and slim bouguerra, it's possible that you are looking at this from a Hive perspective, and see this as a just a small change in Calcite necessary to make Hive work. But the PR needs to be self-contained from Calcite's perspective too. Can you please revisit it?

          Show
          julianhyde Julian Hyde added a comment - We would like this to go into 1.15 but the PR isn't ready. The PR needs tests, and does not have a clear explanation of the new functionality. Jesus Camacho Rodriguez and slim bouguerra , it's possible that you are looking at this from a Hive perspective, and see this as a just a small change in Calcite necessary to make Hive work. But the PR needs to be self-contained from Calcite's perspective too. Can you please revisit it?
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

          Potentially this could cause the whole expression to change type from nullable to not null. So if the type of the whole expression needs to be preserved (think project expressions) compensating casts will need to be added at the top.

          In fact this flag will only be 'false' (i.e. do not match nullability) on the Hive side for the Filter instance of ReduceExpressionsRule. Setting the flag to 'false' for the Project instance of the rule would potentially cause issues with the planner, as the nullability of an expression row type generated by a rule needs to be equal to the one of the expression that was matched by the rule (we talked about this in another issue).

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited Potentially this could cause the whole expression to change type from nullable to not null. So if the type of the whole expression needs to be preserved (think project expressions) compensating casts will need to be added at the top. In fact this flag will only be 'false' (i.e. do not match nullability) on the Hive side for the Filter instance of ReduceExpressionsRule. Setting the flag to 'false' for the Project instance of the rule would potentially cause issues with the planner, as the nullability of an expression row type generated by a rule needs to be equal to the one of the expression that was matched by the rule (we talked about this in another issue).
          Hide
          julianhyde Julian Hyde added a comment -

          Nullability can only change in one direction, so let's be specific: a1 has a nullable type, and a2 is a not-null value.

          Potentially this could cause the whole expression to change type from nullable to not null. So if the type of the whole expression needs to be preserved (think project expressions) compensating casts will need to be added at the top.

          Show
          julianhyde Julian Hyde added a comment - Nullability can only change in one direction, so let's be specific: a1 has a nullable type, and a2 is a not-null value. Potentially this could cause the whole expression to change type from nullable to not null. So if the type of the whole expression needs to be preserved (think project expressions) compensating casts will need to be added at the top.
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment -

          We have an expression a1 that is simplified to a2. "Nullability matching" means that if a2 is the same type as a1 except for nullability flag, we introduce a CAST. However, if we are not matching nullability, then we will not introduce the CAST, as they are already the same base type.

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - We have an expression a1 that is simplified to a2 . "Nullability matching" means that if a2 is the same type as a1 except for nullability flag, we introduce a CAST. However, if we are not matching nullability, then we will not introduce the CAST, as they are already the same base type.
          Hide
          julianhyde Julian Hyde added a comment -

          Maybe I'm stupid, but can someone explain "nullability matching" in simple English?

          Show
          julianhyde Julian Hyde added a comment - Maybe I'm stupid, but can someone explain "nullability matching" in simple English?
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment -

          Julian Hyde, we have been trying to create tests for this issue (ReduceExpressionsRule#FILTER_INSTANCE is disabled by default so we tried to use RelOptRulesTest), but we could not come up with same plan as in Hive. Do you have any other suggestions? slim bouguerra has been testing the rule in Hive with the change his proposed change (HIVE-17871) and everything is working as expected and without issues, so I would like to check in the patch if possible.

          To be clear, the proposed change is quite straightforward: it exposes a flag to match to indicate whether reduce expressions should match nullability or not when expressions are simplified (by default it is true, so current behavior is preserved).

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - Julian Hyde , we have been trying to create tests for this issue (ReduceExpressionsRule#FILTER_INSTANCE is disabled by default so we tried to use RelOptRulesTest), but we could not come up with same plan as in Hive. Do you have any other suggestions? slim bouguerra has been testing the rule in Hive with the change his proposed change ( HIVE-17871 ) and everything is working as expected and without issues, so I would like to check in the patch if possible. To be clear, the proposed change is quite straightforward: it exposes a flag to match to indicate whether reduce expressions should match nullability or not when expressions are simplified (by default it is true , so current behavior is preserved).
          Hide
          jcamachorodriguez Jesus Camacho Rodriguez added a comment -

          slim bouguerra, I am not sure about why that block is commented out. In any case, you can add tests for a specific rule via core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java / core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml. That should be sufficient.

          Show
          jcamachorodriguez Jesus Camacho Rodriguez added a comment - slim bouguerra , I am not sure about why that block is commented out. In any case, you can add tests for a specific rule via core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java / core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml . That should be sufficient.
          Hide
          julianhyde Julian Hyde added a comment -

          slim bouguerra, Your communication style is difficult. What does it mean when you post a link to a PR, with no other text, as a comment to a JIRA case? (You did this in this case, CALCITE-2041, and also in CALCITE-2019.) Most people would assume this means that the PR is ready to review and commit.

          You seem to think this is personal. It is not. But I sure do get pissed off when people waste my time, as you did, by posting incomplete PRs without any indication that they were incomplete. I would have been a lot less pissed off if, after the checkstyle issue in CALCITE-2019, you had simply said "Sorry, I'll fix it."

          And now you are accusing me of making personal attacks. What you think that is going to do for our working relationship?

          Show
          julianhyde Julian Hyde added a comment - slim bouguerra , Your communication style is difficult. What does it mean when you post a link to a PR, with no other text, as a comment to a JIRA case? (You did this in this case, CALCITE-2041 , and also in CALCITE-2019 .) Most people would assume this means that the PR is ready to review and commit. You seem to think this is personal. It is not. But I sure do get pissed off when people waste my time, as you did, by posting incomplete PRs without any indication that they were incomplete. I would have been a lot less pissed off if, after the checkstyle issue in CALCITE-2019 , you had simply said "Sorry, I'll fix it." And now you are accusing me of making personal attacks. What you think that is going to do for our working relationship?
          Hide
          bslim slim bouguerra added a comment -

          Trying to add some testing to class

           org.apache.calcite.rel.rules.ReduceExpressionsRule#FILTER_INSTANCE 

          part of

           org.apache.calcite.prepare.CalcitePrepareImpl#CONSTANT_REDUCTION_RULES 

          but looking at the code base i can see that the only use of such rule is excluded via this block at org/apache/calcite/prepare/CalcitePrepareImpl.java:571 git hash 221739354b56e34e9f1d41b42a0e6881a8f5ddee

           
          // Change the below to enable constant-reduction.
              if (false) {
                for (RelOptRule rule : CONSTANT_REDUCTION_RULES) {
                  planner.addRule(rule);
                }
              }
          

          while the comment says it is enabling constant reductions I can not see how this is done since the block is never executed? not sure what is the idea of keeping such code with

           if (false) 

          .
          Not sure what is the best way to test such rule since it is excluded from planer anyway.

          Show
          bslim slim bouguerra added a comment - Trying to add some testing to class org.apache.calcite.rel.rules.ReduceExpressionsRule#FILTER_INSTANCE part of org.apache.calcite.prepare.CalcitePrepareImpl#CONSTANT_REDUCTION_RULES but looking at the code base i can see that the only use of such rule is excluded via this block at org/apache/calcite/prepare/CalcitePrepareImpl.java:571 git hash 221739354b56e34e9f1d41b42a0e6881a8f5ddee // Change the below to enable constant-reduction. if ( false ) { for (RelOptRule rule : CONSTANT_REDUCTION_RULES) { planner.addRule(rule); } } while the comment says it is enabling constant reductions I can not see how this is done since the block is never executed? not sure what is the idea of keeping such code with if ( false ) . Not sure what is the best way to test such rule since it is excluded from planer anyway.
          Hide
          bslim slim bouguerra added a comment -

          Julian Hyde why the discussions have to be always you reminding people how bad they are and how great you are? The fact that my previous PR failed the checkstyle has nothing to deal with this contribution and I haven't written test yet since am not even sure if this is the way to go.

          Show
          bslim slim bouguerra added a comment - Julian Hyde why the discussions have to be always you reminding people how bad they are and how great you are? The fact that my previous PR failed the checkstyle has nothing to deal with this contribution and I haven't written test yet since am not even sure if this is the way to go.
          Hide
          julianhyde Julian Hyde added a comment -

          slim bouguerra, Please make it clear when you consider a pull request "ready", before we spend time fully testing it. This one doesn't have tests; your previous PR failed checkstyle.

          In this case, tests would help a lot. I don't understand from your example what the rule is supposed to do nor not do. Nor can I understand why everything is not NOT NULL in your example.

          Show
          julianhyde Julian Hyde added a comment - slim bouguerra , Please make it clear when you consider a pull request "ready", before we spend time fully testing it. This one doesn't have tests; your previous PR failed checkstyle. In this case, tests would help a lot. I don't understand from your example what the rule is supposed to do nor not do. Nor can I understand why everything is not NOT NULL in your example.
          Show
          bslim slim bouguerra added a comment - https://github.com/apache/calcite/pull/563

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              bslim slim bouguerra
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development