Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- 4.0.0
- None
Description
Calcite "explodes" IN clauses into the equivalent OR form, and therefore it does not handle such clauses in most of the codebase (notably in RexSimplify).
Hive does the same, but HivePointLookupOptimizerRule re-introduces IN clauses. This happens in the applyPreJoinOrderingTransforms phase, which is fairly early in planning and also runs several other rules that might not fully support IN (notably HiveReduceExpressionsRule, which is based on RexSimplify).
The problem will become even harder in later versions of Calcite (the current version is 1.25), because they are based on SARG, which does not support IN clauses.
IN clauses can be converted into efficient runtime operators, so we want to keep them in the final plan. Intuitively, we just want this translation to happen in a later step, so that the rest of the codebase (Hive and Calcite) remains unaware of IN clauses.
The goal of the ticket is as follows:
- re-convert the output expression of HivePointLookupOptimizerRule into the OR form (keep the logic as-is to benefit from the rule)
- add a rule, in the last step of the planning process, that only converts eligible OR expressions into IN clauses
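To make the two steps concrete, here is a minimal, self-contained sketch of the round trip. It does not use Calcite or Hive classes: `OrToInSketch`, `Eq`, `inToOr`, `orToIn` and the `minValues` threshold are all hypothetical names invented for illustration, and the eligibility test (all disjuncts on the same column, enough distinct literals) is an assumption, not the actual rule logic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OrToInSketch {

    /** Hypothetical stand-in for an equality predicate such as "col = literal". */
    public static final class Eq {
        final String column;
        final String literal;

        public Eq(String column, String literal) {
            this.column = column;
            this.literal = literal;
        }
    }

    /** Step 1 (IN to OR): expand "col IN (v1, v2, ...)" into one equality per value. */
    public static List<Eq> inToOr(String column, List<String> values) {
        List<Eq> disjuncts = new ArrayList<>();
        for (String value : values) {
            disjuncts.add(new Eq(column, value));
        }
        return disjuncts;
    }

    /**
     * Step 2 (OR to IN), meant to run last: fold the disjuncts back into an IN
     * clause only when they are "eligible". Eligibility here is an assumption:
     * every disjunct targets the same column and there are at least minValues
     * distinct literals. Otherwise the OR form is kept unchanged.
     */
    public static String orToIn(List<Eq> disjuncts, int minValues) {
        // Group literals by column, preserving order and dropping duplicates.
        Map<String, Set<String>> byColumn = new LinkedHashMap<>();
        for (Eq eq : disjuncts) {
            byColumn.computeIfAbsent(eq.column, k -> new LinkedHashSet<>()).add(eq.literal);
        }
        if (byColumn.size() == 1) {
            Map.Entry<String, Set<String>> only = byColumn.entrySet().iterator().next();
            if (only.getValue().size() >= minValues) {
                return only.getKey() + " IN (" + String.join(", ", only.getValue()) + ")";
            }
        }
        // Not eligible: render the original OR form.
        List<String> parts = new ArrayList<>();
        for (Eq eq : disjuncts) {
            parts.add(eq.column + " = " + eq.literal);
        }
        return String.join(" OR ", parts);
    }
}
```

In this sketch, everything between the two steps only ever sees plain equality disjuncts, which is the property the ticket is after: the rules that run in between do not need to know about IN at all.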
Issue Links
- relates to HIVE-25870 Avoid simplification in HivePointLookupOptimizerRule, convert only (Open)
- links to