Calcite / CALCITE-3224

New RexNode-to-Expression CodeGen Implementation

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.20.0
    • Fix Version/s: 1.24.0
    • Component/s: core

    Description

      Background

          The current RexNode-to-Expression implementation relies on BlockBuilder's incorrect “optimizations” to inline unsafe operations. As illustrated in CALCITE-3173, when this cooperation breaks in certain special cases, it causes exceptions such as NPEs; see CALCITE-3142, CALCITE-3143 and CALCITE-3150.

          Although these problems can be fixed within the current implementation framework with some effort, as in the PR for CALCITE-3142, the logic would become more and more complex. To pursue a thorough and elegant solution, we implement a new one. It also ensures the correctness of non-optimized code.

      Major Features

      • Visitor Pattern: Each RexNode is visited only once, bottom-up, rather than being visited recursively many times with different NullAs settings.
      • Conditional Semantics: Correctness is guaranteed even without BlockBuilder’s “optimizations”. Each line of code generated for a RexNode is null-safe.
      • Interface Compatibility: The implementation only updates RexToLixTranslator and RexImpTable. Interfaces such as CallImplementor remain unchanged.

      Implementation

          For each RexNode, the visitor generally generates two declaration statements, one for the value and one for its nullability. The code snippet looks like:

      {valueVariable} = {valueExpression}
      
      {isNullVariable} = {isNullExpression}
      

      The visitor’s result will be the variable pair (isNullVariable, valueVariable).
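
      The two-declaration scheme above can be sketched with a toy generator. This is only an illustration under stated assumptions: the Node, InputRef, Literal, Add and VarPair types, the variable naming, and the `row.<field>` access are all hypothetical stand-ins and are not Calcite classes or Calcite's actual output. Each node is visited once, children first, and each visit appends a value declaration and an isNull declaration, then returns the pair of variable names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy generator; every name here is hypothetical and none comes from Calcite. */
final class PairCodeGenSketch {
  /** The pair of variable names returned by a visit. */
  static final class VarPair {
    final String isNullVariable;
    final String valueVariable;
    VarPair(String isNullVariable, String valueVariable) {
      this.isNullVariable = isNullVariable;
      this.valueVariable = valueVariable;
    }
  }

  /** Minimal expression tree standing in for RexNode. */
  interface Node {}
  static final class InputRef implements Node {
    final String field;
    InputRef(String field) { this.field = field; }
  }
  static final class Literal implements Node {
    final int value;
    Literal(int value) { this.value = value; }
  }
  static final class Add implements Node {
    final Node left;
    final Node right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
  }

  /** Generated source lines, in emission order. */
  final List<String> lines = new ArrayList<>();
  private int counter;

  /** Visits each node exactly once, children before parents. */
  VarPair visit(Node node) {
    if (node instanceof Add) {
      Add add = (Add) node;
      VarPair l = visit(add.left);
      VarPair r = visit(add.right);
      // The call is implemented purely in terms of the operands' variable pairs.
      String guard = l.isNullVariable + " || " + r.isNullVariable;
      String sum = l.valueVariable + " + " + r.valueVariable;
      return declare("(" + guard + ") ? null : Integer.valueOf(" + sum + ")", guard);
    }
    if (node instanceof Literal) {
      return declare(String.valueOf(((Literal) node).value), "false");
    }
    String access = "row." + ((InputRef) node).field;
    return declare(access, access + " == null");
  }

  /** Emits {valueVariable} = {valueExpression} and {isNullVariable} = {isNullExpression}. */
  private VarPair declare(String valueExpression, String isNullExpression) {
    int id = counter++;
    VarPair pair = new VarPair("isNull" + id, "value" + id);
    lines.add("Integer " + pair.valueVariable + " = " + valueExpression + ";");
    lines.add("boolean " + pair.isNullVariable + " = " + isNullExpression + ";");
    return pair;
  }

  public static void main(String[] args) {
    PairCodeGenSketch gen = new PairCodeGenSketch();
    VarPair result = gen.visit(new Add(new InputRef("commission"), new Literal(10)));
    for (String line : gen.lines) {
      System.out.println(line);
    }
    System.out.println("result pair: (" + result.isNullVariable + ", " + result.valueVariable + ")");
  }
}
```

      Because a parent only ever reads its operands' variables, no operand expression is re-translated with a different null policy, which is the property the visitor design is aiming for.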

      Other changes:

      (1) Reimplement the different RexCall implementations (e.g., CastImplementor, BinaryImplementor, etc.) as separate files, move them into the newly created package org.apache.calcite.adapter.enumerable.rex, and organize them in RexCallImpTable.

      (2) Move some utility functions into EnumUtils.

      Example Demonstration

      Take a simple test case as an example, in which the "commission" column is nullable.

      @Test public void testNPE() {
        CalciteAssert.hr()
          .query("select \"commission\" + 10 as s\n"
            + "from \"hr\".\"emps\"")
          .returns("S=1010\nS=510\nS=null\nS=260\n");
      }
      

      The codegen process and the non-optimized code are demonstrated in the figure below.

      1. When visiting RexInputRef (commission), the visitor generates three lines of code; the result is a pair of ParameterExpressions (input_isNull, input_value).
      2. Then the visitor visits RexLiteral (10) and generates two lines of code. The result is (literal_isNull, literal_value).
      3. After that, when visiting RexCall (Add), (input_isNull, input_value) and (literal_isNull, literal_value) are used to implement the logic. The visitor again generates two lines of code and returns the variable pair.

      In the end, the result Expression is constructed based on (binary_call_isNull, binary_call_value).
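
      The three steps can be approximated by hand in plain Java. This is a sketch, not the generated code from the figure: the class and method names are hypothetical, the exact statements Calcite emits are not reproduced (for instance, the input reference yields three lines there and two here), and the sample commission values are inferred from the expected results of the test case.

```java
import java.util.Arrays;

/** Hand-written approximation of the non-optimized code; not actual Calcite output. */
final class CommissionPlusTen {
  /** Evaluates "commission" + 10 for one row using (isNull, value) variable pairs. */
  static Integer evaluate(Integer commission) {
    // Step 1: RexInputRef (commission).
    final Integer input_value = commission;
    final boolean input_isNull = input_value == null;

    // Step 2: RexLiteral (10).
    final int literal_value = 10;
    final boolean literal_isNull = false;

    // Step 3: RexCall (Add), written only in terms of the two pairs above.
    // The addition is evaluated only when neither operand is null, so no
    // null Integer is ever unboxed.
    final boolean binary_call_isNull = input_isNull || literal_isNull;
    final int binary_call_value = binary_call_isNull ? 0 : input_value + literal_value;

    // Final result, built from (binary_call_isNull, binary_call_value).
    return binary_call_isNull ? null : Integer.valueOf(binary_call_value);
  }

  public static void main(String[] args) {
    // Commission values inferred from the expected output of the test case.
    for (Integer commission : Arrays.asList(1000, 500, null, 250)) {
      System.out.println("S=" + evaluate(commission));
    }
  }
}
```

      Each statement is safe on its own, so the result does not depend on any later inlining or simplification pass.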

      RexNode-CodeGen.pdf

      Attachments

        1. RexNode-CodeGen.pdf
          857 kB
          Feng Zhu
        2. codegen.png
          39 kB
          Feng Zhu

            People

              Assignee: Feng Zhu (donnyzone)
              Reporter: Feng Zhu (donnyzone)
              Votes: 0
              Watchers: 13

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 6h 40m