Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
The goal of this ticket is to remove the Hive-specific code in HiveReduceExpressionsRule and rely exclusively on the respective Calcite classes (i.e., ReduceExpressionsRule) to reduce maintenance overhead and facilitate code evolution.
Currently the only difference between the in-house (HiveReduceExpressionsRule) and the built-in (ReduceExpressionsRule) reduce expressions rules lies in the way we treat the Filter operator (i.e., FilterReduceExpressionsRule).
Comparing the in-house code with the respective part of Calcite 1.25.0 shows four Hive-specific differences.
Match nullability when reducing expressions
When we reduce filters we always set matchNullability (last parameter) to false.
if (reduceExpressions(filter, expList, predicates, true, false)) {
This means that the original and the reduced expression can have slightly different types in terms of nullability: the original is nullable and the reduced one is not. When the value is true, the type is preserved by adding a "nullability" CAST, that is, a cast to a type that differs from the operand's type only in whether it is nullable.
Hardcoding matchNullability to false was done as part of the upgrade in Calcite 1.15.0 (HIVE-18068) where the behavior of the rule became configurable (CALCITE-2041).
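To make the effect of the flag concrete, here is a minimal sketch using a toy expression model. ToyType, ToyExpr and reduce are hypothetical names invented for this illustration; they are not Calcite classes and the real rule does considerably more. The sketch only shows how a nullable expression that folds to a constant either keeps its nullable type through a cast (matchNullability = true) or changes type to NOT NULL (matchNullability = false).

```java
// Toy model for illustration only. ToyType and ToyExpr are hypothetical
// stand-ins and are NOT Calcite's RelDataType or RexNode.
final class MatchNullabilitySketch {

    static final class ToyType {
        final String name;
        final boolean nullable;

        ToyType(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }

        @Override
        public String toString() {
            return nullable ? name : name + " NOT NULL";
        }
    }

    static final class ToyExpr {
        final String text;
        final ToyType type;

        ToyExpr(String text, ToyType type) {
            this.text = text;
            this.type = type;
        }

        @Override
        public String toString() {
            return text + " : " + type;
        }
    }

    /**
     * Pretends that 'original' was folded to the non-null constant 'constantText'.
     * With matchNullability the constant is wrapped in a cast back to the original
     * (nullable) type; without it the result keeps its NOT NULL type.
     */
    static ToyExpr reduce(ToyExpr original, String constantText, boolean matchNullability) {
        ToyExpr reduced = new ToyExpr(constantText, new ToyType(original.type.name, false));
        if (matchNullability && original.type.nullable) {
            String castText = "CAST(" + reduced.text + " AS " + original.type + ")";
            return new ToyExpr(castText, original.type);
        }
        return reduced;
    }
}
```

In this model, reducing a nullable BOOLEAN expression such as CASE WHEN TRUE THEN TRUE ELSE NULL END to TRUE yields a cast of nullable BOOLEAN type when the flag is true, and a plain TRUE of type BOOLEAN NOT NULL when it is false.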
Remove nullability cast explicitly
When the expression is reduced, we try to remove the nullability cast if there is one.
if (RexUtil.isNullabilityCast(filter.getCluster().getTypeFactory(), newConditionExp)) {
  newConditionExp = ((RexCall) newConditionExp).getOperands().get(0);
}
The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316). However, the code is redundant as of HIVE-18068; setting matchNullability to false no longer generates nullability casts during the reduction.
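The idea behind stripping a nullability cast can be sketched with a toy model. ToyType, ToyExpr, ToyCast and both helper methods below are hypothetical and only mimic the shape of the check; they are not the Calcite implementation of RexUtil.isNullabilityCast. The assumption here is that a cast counts as a nullability cast when the target type and the operand type are the same once nullability is ignored.

```java
// Toy model for illustration only. None of these classes are Calcite classes.
final class NullabilityCastSketch {

    static final class ToyType {
        final String name;
        final boolean nullable;

        ToyType(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    static class ToyExpr {
        final ToyType type;

        ToyExpr(ToyType type) {
            this.type = type;
        }
    }

    /** A cast whose result type is 'type' and whose single operand is 'operand'. */
    static final class ToyCast extends ToyExpr {
        final ToyExpr operand;

        ToyCast(ToyType type, ToyExpr operand) {
            super(type);
            this.operand = operand;
        }
    }

    /** True when 'e' is a cast whose type equals the operand type, ignoring nullability. */
    static boolean isNullabilityCast(ToyExpr e) {
        if (!(e instanceof ToyCast)) {
            return false;
        }
        ToyExpr operand = ((ToyCast) e).operand;
        return e.type.name.equals(operand.type.name);
    }

    /** Returns the operand of a nullability cast, or the expression itself otherwise. */
    static ToyExpr stripNullabilityCast(ToyExpr e) {
        return isNullabilityCast(e) ? ((ToyCast) e).operand : e;
    }
}
```

If no such cast is ever produced, which the ticket states is the case once matchNullability is false, stripNullabilityCast always returns its argument unchanged, which is why the explicit removal becomes dead code.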
Avoid creating filters with condition of type NULL
if (newConditionExp.getType().getSqlTypeName() == SqlTypeName.NULL) {
  newConditionExp = call.builder().cast(newConditionExp, SqlTypeName.BOOLEAN);
}
Hive tries to cast such expressions to BOOLEAN to avoid the weird (and possibly problematic) situation of having a condition with NULL type.
In Calcite, there is specific code that detects whether the new condition is the NULL literal (with NULL type) and, if so, replaces the relation with an empty one.
} else if (newConditionExp instanceof RexLiteral
    || RexUtil.isNullLiteral(newConditionExp, true)) {
  call.transformTo(createEmptyRelOrEquivalent(call, filter));
Because of that, the Hive-specific code is redundant if the Calcite rule is used.
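Why turning such a filter into an empty relation is safe can be shown with a small standalone sketch of SQL filter semantics. The class and method names are hypothetical and rows are modelled as plain Java objects; this is not Hive or Calcite code. A filter keeps a row only when its condition evaluates to TRUE, so a condition that is always NULL (unknown) keeps nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Standalone illustration of three-valued filter semantics; not Hive or Calcite code.
final class NullConditionSketch {

    /**
     * SQL-style filter: a row is kept only when the condition evaluates to TRUE.
     * Both FALSE and NULL (modelled as a null Boolean) drop the row.
     */
    static <R> List<R> filter(List<R> rows, Function<R, Boolean> condition) {
        List<R> kept = new ArrayList<>();
        for (R row : rows) {
            if (Boolean.TRUE.equals(condition.apply(row))) {
                kept.add(row);
            }
        }
        return kept;
    }
}
```

Filtering any input with a condition that always returns null produces an empty list, which is the same result the empty relation gives, so no BOOLEAN cast is needed to make the plan well behaved.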
Bail out when input to reduceNotNullableFilter is not a RexCall
if (!(rexCall.getOperands().get(0) instanceof RexCall)) {
  // If child is not a RexCall instance, we can bail out
  return;
}
The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316) but it does not add any functional value.
The instanceof check is redundant since the code in reduceNotNullableFilter is a no-op when the expression/call is not one of the following: IS_NULL, IS_UNKNOWN, IS_NOT_NULL, which are all rex calls.
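The redundancy argument can be checked on a toy model. The Kind enum, the outcome strings and both methods below are hypothetical and only approximate the dispatch described above; they are not the Calcite implementation of reduceNotNullableFilter. The assumption is that the reduction acts only on IS_NULL, IS_UNKNOWN and IS_NOT_NULL over a non-nullable operand and leaves everything else untouched.

```java
// Toy model for illustration only; Kind and the outcome strings are invented here.
final class NotNullableFilterSketch {

    enum Kind { IS_NULL, IS_UNKNOWN, IS_NOT_NULL, OTHER_CALL, INPUT_REF, LITERAL }

    /** Only the first four kinds are calls in this model. */
    static boolean isCall(Kind kind) {
        return kind != Kind.INPUT_REF && kind != Kind.LITERAL;
    }

    /** Reduction without any guard: every kind outside the three handled ones is a no-op. */
    static String reduce(Kind kind, boolean operandNullable) {
        switch (kind) {
        case IS_NULL:
        case IS_UNKNOWN:
            // Always false on a non-nullable operand, so the filter passes no rows.
            return operandNullable ? "UNCHANGED" : "EMPTY";
        case IS_NOT_NULL:
            // Always true on a non-nullable operand, so the filter can be dropped.
            return operandNullable ? "UNCHANGED" : "INPUT";
        default:
            return "UNCHANGED";
        }
    }

    /** Same reduction with an explicit bail-out for non-call expressions. */
    static String reduceWithGuard(Kind kind, boolean operandNullable) {
        if (!isCall(kind)) {
            return "UNCHANGED";
        }
        return reduce(kind, operandNullable);
    }
}
```

For every kind and nullability combination in this model the guarded and unguarded versions return the same outcome, because the expressions rejected by the guard already fall into the default no-op branch.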
Summary
All of the Hive specific changes mentioned previously can be safely replaced by appropriate uses of the Calcite APIs without affecting the behavior of CBO.
Issue Links
- relates to CALCITE-2041 When ReduceExpressionRule simplifies a nullable expression, allow the result to change type to NOT NULL (Closed)
- links to