HIVE-25852

Introduce IN clauses at the very end of query planning

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: CBO

    Description

      Calcite "explodes" IN clauses into the equivalent OR form, so most of its codebase (notably RexSimplify) does not handle IN clauses.
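To make the "explosion" concrete, here is a minimal sketch on a toy expression model. The `Eq`, `In`, and `Or` classes and the `expand_in` function are hypothetical names made up for this illustration; they are not Calcite or Hive classes, and the real expansion operates on RexNode trees.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Toy expression nodes (hypothetical; NOT Calcite's RexNode hierarchy).
@dataclass(frozen=True)
class Eq:
    column: str
    value: object


@dataclass(frozen=True)
class In:
    column: str
    values: Tuple[object, ...]


@dataclass(frozen=True)
class Or:
    operands: Tuple["Expr", ...]


Expr = Union[Eq, In, Or]


def expand_in(expr: Expr) -> Expr:
    """Rewrite `col IN (v1, ..., vn)` as `col = v1 OR ... OR col = vn`.

    Assumes every IN list is non-empty.
    """
    if isinstance(expr, In):
        equalities = tuple(Eq(expr.column, v) for v in expr.values)
        # A single-value IN is just one equality; no OR wrapper is needed.
        return equalities[0] if len(equalities) == 1 else Or(equalities)
    if isinstance(expr, Or):
        # Recurse so that IN clauses nested under an OR are expanded too.
        return Or(tuple(expand_in(op) for op in expr.operands))
    return expr


expanded = expand_in(In("x", (1, 2, 3)))
```

After expansion, downstream logic only ever sees `Eq` and `Or` nodes, which mirrors why code written against the OR form has no need for IN handling.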

      Hive does the same, but HivePointLookupOptimizerRule re-introduces IN clauses. The rule runs in the applyPreJoinOrderingTransforms phase, which is quite early in planning and which mixes in several other rules that might not fully support IN (notably HiveReduceExpressionsRule, which is based on RexSimplify).

      The problem will become even harder with later versions of Calcite (Hive is currently on 1.25), which are based on SARG, and SARG does not support IN clauses.

      IN clauses can be converted into efficient runtime operators, so we want to keep them in the final plan. Intuitively, we just want this translation to happen in a later step, so that the rest of the codebase (Hive and Calcite) remains unaware of IN clauses.

      The goal of the ticket is as follows:

      1. Re-convert the output expression of HivePointLookupOptimizerRule into the OR form (keeping the rule's logic as-is, so we still benefit from it).
      2. Add a rule, in the last step of the planning process, that only converts eligible OR expressions into IN clauses.
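The second goal could look roughly like the sketch below, again on a toy expression model. Everything here is an assumption for illustration: the `Eq`, `In`, and `Or` classes, the `or_to_in` function, the `min_values` threshold, and the notion of "eligible" used (an OR whose operands are all equalities on the same column). The actual eligibility criteria are those of HivePointLookupOptimizerRule and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Toy expression nodes (hypothetical; NOT Calcite's RexNode hierarchy).
@dataclass(frozen=True)
class Eq:
    column: str
    value: object


@dataclass(frozen=True)
class In:
    column: str
    values: Tuple[object, ...]


@dataclass(frozen=True)
class Or:
    operands: Tuple["Expr", ...]


Expr = Union[Eq, In, Or]


def or_to_in(expr: Expr, min_values: int = 2) -> Expr:
    """Fold `col = v1 OR ... OR col = vn` into `col IN (v1, ..., vn)`.

    An OR is treated as eligible only if every operand is an equality on
    the same column and at least `min_values` distinct values remain.
    Anything else is returned unchanged.
    """
    if not isinstance(expr, Or):
        return expr
    ops = expr.operands
    same_column = (
        all(isinstance(op, Eq) for op in ops)
        and len({op.column for op in ops}) == 1
    )
    if same_column:
        # Drop duplicate values while preserving first-seen order.
        values = tuple(dict.fromkeys(op.value for op in ops))
        if len(values) >= min_values:
            return In(ops[0].column, values)
    return expr


folded = or_to_in(Or((Eq("x", 1), Eq("x", 2), Eq("x", 1))))
mixed = or_to_in(Or((Eq("x", 1), Eq("y", 2))))
```

Because this conversion would run only as the final planning step, every earlier rule keeps working on the OR form, and only the plan handed to the runtime contains IN.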

            People

              Assignee: Unassigned
              Reporter: Alessandro Solimando (asolimando)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m