  1. Calcite
  2. CALCITE-4300

EnumerableBatchNestedLoopJoin dynamic code generation can lead to variable name issues if two EBNLJ are nested

Details

    Description

      The EnumerableBatchNestedLoopJoin#implement method defines a variable named corrList in the generated code; it stores the correlating variables of the EBNLJ operator. Under certain circumstances (virtually impossible to reproduce in Calcite core, but feasible in downstream projects with further optimizations such as an IndexScan into which the two batches of correlating variables can be "pushed"), this naming causes problems when two EBNLJ operators are nested:

      /*   5 */   final com.onwbp.org.apache.calcite.linq4j.Enumerable _inputEnumerable = com.onwbp.org.apache.calcite.linq4j.EnumerableDefaults.correlateBatchJoin(..., ..., new com.onwbp.org.apache.calcite.linq4j.function.Function1() {
      /*   6 */     public com.onwbp.org.apache.calcite.linq4j.AbstractEnumerable apply(final java.util.List corrList) { // corrList1
      /*   7 */       {
      ...
      /*  11 */         final com.onwbp.org.apache.calcite.linq4j.Enumerable _inputEnumerable = com.onwbp.org.apache.calcite.linq4j.EnumerableDefaults.correlateBatchJoin(..., ..., new com.onwbp.org.apache.calcite.linq4j.function.Function1() {
      /*  12 */           public com.onwbp.org.apache.calcite.linq4j.Enumerable apply(final java.util.List corrList) { // corrList2
      /*  13 */             {
      ...
      /*  16 */                 myContext.putCorrelatingValue("$cor10.0", ((Object[]) corrList.get(0))[0]); // meant to read corrList1 but resolves to corrList2: problem!
      /*  17 */                 myContext.putCorrelatingValue("$cor11.0", ((Object[]) corrList.get(1))[0]); // meant to read corrList1 but resolves to corrList2: problem!
      /*  18 */                 myContext.putCorrelatingValue("$cor34.0", (String) corrList.get(0)); // meant to read corrList2, which is what it resolves to; works by chance
      /*  19 */                 myContext.putCorrelatingValue("$cor35.0", (String) corrList.get(1)); // meant to read corrList2, which is what it resolves to; works by chance
      .
      

      Notice that the generated code declares two parameters named "corrList" (lines 6 and 12). Because they share the same name, the inner declaration shadows the outer one, so every reference on lines 16-19 resolves to the second one, including the references on lines 16-17 that are meant to read the first.
      The fix is simple: each EnumerableBatchNestedLoopJoin must guarantee a unique name for its corrList in the generated code.
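
The shadowing and the effect of unique names can be reproduced outside Calcite with a minimal, self-contained sketch. It does not use Calcite's code generator: `CorrListNaming`, `NameGenerator`, `sharedName`, `uniqueNames` and the batch values are hypothetical names chosen for illustration, and the counter-based naming is only one possible way to guarantee uniqueness, not necessarily how the actual fix is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical demo class; none of these names come from Calcite. */
class CorrListNaming {

  /** Hypothetical helper: hands out base0, base1, ... so no two names collide. */
  static final class NameGenerator {
    private int next = 0;

    String newName(String base) {
      return base + next++;
    }
  }

  /** Both nested functions name their parameter corrList, as in the generated code. */
  static List<String> sharedName(final List<String> outerBatch, final List<String> innerBatch) {
    Function<List<String>, List<String>> outer = new Function<List<String>, List<String>>() {
      @Override public List<String> apply(final List<String> corrList) { // outer batch
        Function<List<String>, List<String>> inner = new Function<List<String>, List<String>>() {
          @Override public List<String> apply(final List<String> corrList) { // shadows the outer one
            List<String> values = new ArrayList<>();
            values.add(corrList.get(0)); // meant for the outer batch, resolves to the inner one
            values.add(corrList.get(0)); // meant for the inner batch
            return values;
          }
        };
        return inner.apply(innerBatch);
      }
    };
    return outer.apply(outerBatch);
  }

  /** Same nesting, but each level uses its own parameter name. */
  static List<String> uniqueNames(final List<String> outerBatch, final List<String> innerBatch) {
    Function<List<String>, List<String>> outer = new Function<List<String>, List<String>>() {
      @Override public List<String> apply(final List<String> corrList0) { // outer batch
        Function<List<String>, List<String>> inner = new Function<List<String>, List<String>>() {
          @Override public List<String> apply(final List<String> corrList1) { // inner batch
            List<String> values = new ArrayList<>();
            values.add(corrList0.get(0)); // unambiguously the outer batch
            values.add(corrList1.get(0)); // unambiguously the inner batch
            return values;
          }
        };
        return inner.apply(innerBatch);
      }
    };
    return outer.apply(outerBatch);
  }

  public static void main(String[] args) {
    List<String> outerBatch = Arrays.asList("outer0");
    List<String> innerBatch = Arrays.asList("inner0");

    System.out.println(sharedName(outerBatch, innerBatch));  // prints [inner0, inner0]
    System.out.println(uniqueNames(outerBatch, innerBatch)); // prints [outer0, inner0]

    // One generator shared by both operators yields distinct parameter names.
    NameGenerator names = new NameGenerator();
    System.out.println(names.newName("corrList")); // prints corrList0
    System.out.println(names.newName("corrList")); // prints corrList1
  }
}
```

With the shared name, the value meant to come from the outer batch is silently taken from the inner one and the code still compiles, which is why the problem only shows up at run time as a wrong value or a failed cast.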

            People

              rubenql Ruben Q L

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m