CALCITE-6468

RelDecorrelator throws AssertionError if correlated variable is used as Aggregate group key

Details

    Description

      The problem can be reproduced with this query (a "simplified" version of TPC-DS query1):

      WITH agg_sal AS
        (SELECT deptno, sum(sal) AS total FROM emp GROUP BY deptno)
      SELECT 1 FROM agg_sal s1
      WHERE s1.total > (SELECT avg(total) FROM agg_sal s2 WHERE s1.deptno = s2.deptno)
      

      If we apply the sub-query program, then FilterAggregateTransposeRule, and then call RelDecorrelator, it fails with:

      java.lang.AssertionError
      	at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:581)
      	at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495)
      	...
      

      The problem appears in this assert (RelDecorrelator.java:581):

      assert newPos == newInputOutput.size();
      

      The root cause seems to be that, a few lines earlier, when the correlating variables from corDefOutputs are processed, their input position is inserted into mapNewInputToProjOutputs:

      if (!frame.corDefOutputs.isEmpty()) {
        for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
          RexInputRef.add2(projects, entry.getValue(), newInputOutput);
          corDefOutputs.put(entry.getKey(), newPos);
          mapNewInputToProjOutputs.put(entry.getValue(), newPos); // <-- HERE
          newPos++;
        }
      }
      

      The problem is that this key was already in the map, because it had been inserted earlier as part of the group key processing:

      for (int i = 0; i < oldGroupKeyCount; i++) {
        final int idx = groupKeyIndices.get(i);
        ...
        // add mapping of group keys.
        outputMap.put(idx, newPos);
        int newInputPos = requireNonNull(frame.oldToNewOutputs.get(idx));
        RexInputRef.add2(projects, newInputPos, newInputOutput);
        mapNewInputToProjOutputs.put(newInputPos, newPos); // <-- HERE first added
        newPos++;
      }
      

      Therefore, the unnecessary insertion into mapNewInputToProjOutputs and the subsequent increment of newPos when the CorDefs are processed lead to the mismatch: the same input field is counted twice, so newPos ends up larger than newInputOutput.size().
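
      The double counting can be replayed outside Calcite. The following is only a minimal sketch on plain integers, not the actual RelDecorrelator code: the class name, the method name and its parameters are hypothetical, and the projection list is left out because only the position counter matters here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of the position bookkeeping; not Calcite code. */
final class DuplicateCorDefPositionDemo {

  /**
   * Replays the three loops from the description and returns the final newPos.
   *
   * @param newInputFieldCount stand-in for newInputOutput.size()
   * @param groupKeyPositions  new-input positions of the group keys
   * @param corDefPositions    new-input positions of the correlated columns
   */
  static int finalNewPos(int newInputFieldCount,
      List<Integer> groupKeyPositions, List<Integer> corDefPositions) {
    final Map<Integer, Integer> mapNewInputToProjOutputs = new HashMap<>();
    int newPos = 0;

    // Group keys: each one takes the next output position.
    for (int newInputPos : groupKeyPositions) {
      mapNewInputToProjOutputs.put(newInputPos, newPos);
      newPos++;
    }

    // Correlated columns: inserted unconditionally, as in the failing code.
    for (int newInputPos : corDefPositions) {
      mapNewInputToProjOutputs.put(newInputPos, newPos);
      newPos++;
    }

    // Remaining fields: only those not mapped yet.
    for (int i = 0; i < newInputFieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return newPos;
  }

  public static void main(String[] args) {
    // Two input fields; field 0 is both a group key and a correlated column.
    int newPos = finalNewPos(2, Arrays.asList(0), Arrays.asList(0));
    System.out.println("newPos = " + newPos + ", input field count = 2");
  }
}
```

      With field 0 used both as group key and as correlated column, the counter reaches 3 for an input of only 2 fields, which is the condition the assert rejects. When the correlated column sits at a different position than the group key, the counter matches the field count.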

      Notice how, right before the assertion, when the remaining fields are processed, the code first checks that the position is not already contained in mapNewInputToProjOutputs:

      // add the remaining fields
      final int newGroupKeyCount = newPos;
      for (int i = 0; i < newInputOutput.size(); i++) {
        if (!mapNewInputToProjOutputs.containsKey(i)) { // <-- HERE checked
          RexInputRef.add2(projects, i, newInputOutput);
          mapNewInputToProjOutputs.put(i, newPos);
          newPos++;
        }
      }
      

      Thus, the solution would probably be to apply the same logic when the CorDefs are processed:

      if (!frame.corDefOutputs.isEmpty()) {
        for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
          final Integer pos = mapNewInputToProjOutputs.get(entry.getValue()); // <-- HERE add map verification
          if (pos == null) {
            RexInputRef.add2(projects, entry.getValue(), newInputOutput);
            corDefOutputs.put(entry.getKey(), newPos);
            mapNewInputToProjOutputs.put(entry.getValue(), newPos);
            newPos++;
          } else {
            corDefOutputs.put(entry.getKey(), pos);
          }
        }
      }
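
      The effect of such a guard can be checked on plain integers. This is only a sketch under assumptions, not the Calcite implementation: the class and method names are hypothetical, String keys stand in for CorDef, and the projection list is omitted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of the guarded bookkeeping; not Calcite code. */
final class GuardedCorDefPositionDemo {

  /**
   * Replays the loops with the proposed guard and returns the final newPos.
   *
   * @param newInputFieldCount stand-in for newInputOutput.size()
   * @param groupKeyPositions  new-input positions of the group keys
   * @param corDefInputs       stand-in for frame.corDefOutputs (String replaces CorDef)
   * @param corDefOutputs      filled with the output position of each correlated column
   */
  static int finalNewPos(int newInputFieldCount,
      List<Integer> groupKeyPositions,
      Map<String, Integer> corDefInputs,
      Map<String, Integer> corDefOutputs) {
    final Map<Integer, Integer> mapNewInputToProjOutputs = new HashMap<>();
    int newPos = 0;

    // Group keys: each one takes the next output position.
    for (int newInputPos : groupKeyPositions) {
      mapNewInputToProjOutputs.put(newInputPos, newPos);
      newPos++;
    }

    // Correlated columns: reuse the existing position if the field is already projected.
    for (Map.Entry<String, Integer> entry : corDefInputs.entrySet()) {
      final Integer pos = mapNewInputToProjOutputs.get(entry.getValue());
      if (pos == null) {
        corDefOutputs.put(entry.getKey(), newPos);
        mapNewInputToProjOutputs.put(entry.getValue(), newPos);
        newPos++;
      } else {
        corDefOutputs.put(entry.getKey(), pos);
      }
    }

    // Remaining fields: only those not mapped yet.
    for (int i = 0; i < newInputFieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return newPos;
  }

  public static void main(String[] args) {
    // Two input fields; field 0 is both a group key and a correlated column.
    final Map<String, Integer> corDefOutputs = new HashMap<>();
    int newPos = finalNewPos(2, Arrays.asList(0),
        Collections.singletonMap("cor0.deptno", 0), corDefOutputs);
    System.out.println("newPos = " + newPos + ", corDefOutputs = " + corDefOutputs);
  }
}
```

      In this model the correlated column is mapped to the output position already assigned to the group key, and the counter ends at the input field count, so the condition checked by the assert holds.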
      

            People

              rubenql Ruben Q L