  Hive / HIVE-26638

Replace in-house CBO reduce expressions rules with Calcite's built-in classes




      The goal of this ticket is to remove the Hive-specific code in HiveReduceExpressionsRule and rely exclusively on the corresponding Calcite classes (i.e., ReduceExpressionsRule), which reduces maintenance overhead and makes the code easier to evolve.

      Currently the only difference between in-house (HiveReduceExpressionsRule) and built-in (ReduceExpressionsRule) reduce expressions rules lies in the way we treat the Filter operator (i.e., FilterReduceExpressionsRule).

      Comparing the in-house code with the corresponding part of Calcite 1.25.0 reveals four Hive-specific differences.

      Match nullability when reducing expressions
      When we reduce filters we always set matchNullability (last parameter) to false.

      if (reduceExpressions(filter, expList, predicates, true, false)) {

      This means that the original and the reduced expression can have slightly different types in terms of nullability: the original is nullable and the reduced one is not. When the value is true, the type is preserved by adding a "nullability" CAST, that is, a cast to the same type that differs only in whether it is nullable.

      Hardcoding matchNullability to false was done as part of the upgrade to Calcite 1.15.0 (HIVE-18068), where the behavior of the rule became configurable (CALCITE-2041).

      Remove nullability cast explicitly
      When the expression is reduced, we try to remove the nullability cast, if there is one.

      if (RexUtil.isNullabilityCast(filter.getCluster().getTypeFactory(), newConditionExp)) {
        newConditionExp = ((RexCall) newConditionExp).getOperands().get(0);
      }

      The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316). However, the code is redundant as of HIVE-18068; setting matchNullability to false no longer generates nullability casts during the reduction.
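      The interaction between matchNullability and the explicit cast removal can be illustrated with a small toy model. This is only a sketch: Type, Expr, finishReduction and stripNullabilityCast are hypothetical names invented for the illustration and are not part of Calcite or Hive. The model assumes that a nullability cast is introduced only when matchNullability is true.

```java
/** Toy model of nullability casts; hypothetical classes, not Calcite's RexNode API. */
final class NullabilityCastSketch {

    /** A SQL type reduced to its name plus a nullability flag. */
    static final class Type {
        final String name;
        final boolean nullable;

        Type(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    /** An expression that is either a leaf or a CAST wrapping one operand. */
    static final class Expr {
        final Type type;
        final Expr castOperand; // null for a leaf

        Expr(Type type, Expr castOperand) {
            this.type = type;
            this.castOperand = castOperand;
        }
    }

    /** True if e is a CAST whose target differs from its operand only in nullability. */
    static boolean isNullabilityCast(Expr e) {
        return e.castOperand != null
            && e.type.name.equals(e.castOperand.type.name)
            && e.type.nullable != e.castOperand.type.nullable;
    }

    /** Models the end of a reduction: optionally restore the original type with a CAST. */
    static Expr finishReduction(Expr original, Expr reduced, boolean matchNullability) {
        boolean onlyNullabilityDiffers = original.type.name.equals(reduced.type.name)
            && original.type.nullable != reduced.type.nullable;
        if (matchNullability && onlyNullabilityDiffers) {
            return new Expr(original.type, reduced);
        }
        return reduced;
    }

    /** Models the explicit removal step: unwrap the CAST if it is a nullability cast. */
    static Expr stripNullabilityCast(Expr e) {
        return isNullabilityCast(e) ? e.castOperand : e;
    }

    public static void main(String[] args) {
        Expr original = new Expr(new Type("BOOLEAN", true), null);
        Expr reduced = new Expr(new Type("BOOLEAN", false), null);

        Expr matched = finishReduction(original, reduced, true);
        Expr unmatched = finishReduction(original, reduced, false);

        System.out.println("cast added when matching: " + isNullabilityCast(matched));
        System.out.println("cast added when not matching: " + isNullabilityCast(unmatched));
        System.out.println("strip is a no-op: " + (stripNullabilityCast(unmatched) == unmatched));
    }
}
```

      In this model, with matchNullability set to false the reduction never wraps the result in a cast, so stripNullabilityCast returns its argument unchanged. That mirrors the redundancy argument made for the Hive code.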

      Avoid creating filters with condition of type NULL

      if (newConditionExp.getType().getSqlTypeName() == SqlTypeName.NULL) {
        newConditionExp = call.builder().cast(newConditionExp, SqlTypeName.BOOLEAN);
      }

      Hive tries to cast such expressions to BOOLEAN to avoid the weird (and possibly problematic) situation of having a condition with NULL type.

      In Calcite, there is specific code for detecting if the new condition is the NULL literal (with NULL type) and if that's the case it turns the relation to empty.

      } else if (newConditionExp instanceof RexLiteral
          || RexUtil.isNullLiteral(newConditionExp, true)) {
        call.transformTo(createEmptyRelOrEquivalent(call, filter));

      Therefore, the Hive-specific code is redundant when the Calcite rule is used.
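      The argument can be sketched with a toy model of the branch order. The names below (Condition, Outcome, rewrite, needsBooleanCast) are hypothetical and do not exist in Calcite or Hive. The first branch, which drops a filter whose condition is the TRUE literal, is included as an assumption for completeness. The model also assumes that the only condition with NULL type is the NULL literal.

```java
/** Toy model of the branch order in a filter-reduction rule; hypothetical names, not Calcite's API. */
final class NullConditionSketch {

    enum TypeName { BOOLEAN, NULL }

    enum Outcome { DROP_FILTER, EMPTY_RELATION, KEEP_FILTER }

    /** A reduced filter condition: a literal (TRUE, FALSE or NULL) or some other expression. */
    static final class Condition {
        final TypeName type;
        final boolean literal;
        final Boolean value; // meaningful only for literals; null models the NULL literal

        private Condition(TypeName type, boolean literal, Boolean value) {
            this.type = type;
            this.literal = literal;
            this.value = value;
        }

        static Condition booleanLiteral(boolean value) {
            return new Condition(TypeName.BOOLEAN, true, value);
        }

        /** Assumption of the model: the NULL literal is the only condition with NULL type. */
        static Condition nullLiteral() {
            return new Condition(TypeName.NULL, true, null);
        }

        static Condition expression() {
            return new Condition(TypeName.BOOLEAN, false, null);
        }
    }

    /** Decides what happens to the filter once its condition has been reduced. */
    static Outcome rewrite(Condition c) {
        if (c.literal && Boolean.TRUE.equals(c.value)) {
            return Outcome.DROP_FILTER;    // always true: the filter is unnecessary
        }
        if (c.literal) {
            return Outcome.EMPTY_RELATION; // FALSE or NULL literal: no row can pass
        }
        return Outcome.KEEP_FILTER;        // keep the filter with the reduced condition
    }

    /** True if a surviving filter would still carry a NULL-typed condition. */
    static boolean needsBooleanCast(Condition c) {
        return rewrite(c) == Outcome.KEEP_FILTER && c.type == TypeName.NULL;
    }

    public static void main(String[] args) {
        System.out.println(rewrite(Condition.nullLiteral()));
        System.out.println(needsBooleanCast(Condition.nullLiteral()));
    }
}
```

      Under these assumptions a NULL-typed condition is turned into an empty relation before any new filter is built, so there is nothing left for a cast to BOOLEAN to fix.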

      Bail out when input to reduceNotNullableFilter is not a RexCall

      if (!(rexCall.getOperands().get(0) instanceof RexCall)) {
            // If child is not a RexCall instance, we can bail out

      The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316) but it does not add any functional value.
      The instanceof check is redundant because the code in reduceNotNullableFilter is a no-op unless the expression is one of IS_NULL, IS_UNKNOWN, or IS_NOT_NULL, all of which are RexCall instances.
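      The redundancy can be checked on a toy model. Node, Call, Kind, reduces and reducesWithGuard are hypothetical names used only for this sketch. The real reduceNotNullableFilter has a different signature and does more work than returning a flag. The model assumes, as stated above, that the three relevant kinds are only ever built as calls.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the instanceof guard; hypothetical names, not Calcite's or Hive's API. */
final class NotNullableFilterSketch {

    enum Kind { IS_NULL, IS_UNKNOWN, IS_NOT_NULL, EQUALS, INPUT_REF, LITERAL }

    /** A generic expression node. */
    static class Node {
        final Kind kind;

        Node(Kind kind) {
            this.kind = kind;
        }
    }

    /** A call node; in this model IS_NULL, IS_UNKNOWN and IS_NOT_NULL are always calls. */
    static final class Call extends Node {
        Call(Kind kind) {
            super(kind);
        }
    }

    /** Models the core logic: it acts on three kinds and is a no-op for everything else. */
    static boolean reduces(Node node) {
        switch (node.kind) {
            case IS_NULL:
            case IS_UNKNOWN:
            case IS_NOT_NULL:
                return true;
            default:
                return false;
        }
    }

    /** Same logic preceded by the Hive-style bail-out on nodes that are not calls. */
    static boolean reducesWithGuard(Node node) {
        if (!(node instanceof Call)) {
            return false; // bail out early
        }
        return reduces(node);
    }

    /** A few representative nodes: calls of each kind plus two nodes that are not calls. */
    static List<Node> samples() {
        return Arrays.asList(
            new Call(Kind.IS_NULL), new Call(Kind.IS_UNKNOWN), new Call(Kind.IS_NOT_NULL),
            new Call(Kind.EQUALS), new Node(Kind.INPUT_REF), new Node(Kind.LITERAL));
    }

    public static void main(String[] args) {
        for (Node n : samples()) {
            System.out.println(n.kind + ": same result = " + (reduces(n) == reducesWithGuard(n)));
        }
    }
}
```

      For every sample node the guarded and unguarded variants agree, which is the sense in which the guard adds no functional value in this model.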


      All of the Hive-specific changes described above can be safely replaced by appropriate uses of the Calcite APIs without affecting the behavior of CBO.





              zabetak Stamatis Zampetakis



                Time Tracking

                  Original Estimate - Not Specified
                  Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 20m