Calcite / CALCITE-4414

RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong references


Details

    Description

      RelMdSelectivity#getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate) method:

        public Double getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate) {
          final RexProgram rexProgram = rel.getProgram();
          final RexLocalRef programCondition = rexProgram.getCondition();
          if (programCondition == null) {
            return getSelectivity(rel.getInput(), mq, predicate); // [2]
          } else {
            // [1]
            return mq.getSelectivity(rel.getInput(),
                RelMdUtil.minusPreds(
                    rel.getCluster().getRexBuilder(),
                    predicate,
                    rexProgram.expandLocalRef(programCondition)));
          }
        }
      

      currently passes the predicate down to its input [1] without translating it. The predicate may reference expressions generated by the Calc's projections, so when the Calc's input analyzes the predicate, it can end up trying to access fields that do not exist in its rowType.

      This can lead to unexpected failures, as in the test attached to the first comment: after RelMdSelectivity#getSelectivity(Calc) we reach RelMdSelectivity#getSelectivity(Union), which throws an ArrayIndexOutOfBoundsException because it tries to access a field ($1) that does not exist in its rowType (which only has $0). This $1 is actually projected by the Calc that sits on top of the Union.
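      To make the failure mode concrete, here is a minimal sketch in plain Java. It is a toy model and does not use Calcite: row types are lists of type names, and the predicate is reduced to the single field index it reads. The class and method names (StaleFieldRefDemo, fieldType, failsOn) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model, not Calcite code: a predicate is reduced to the field index it reads. */
final class StaleFieldRefDemo {

  /** Looks up the type of field $index in a row type, as a metadata handler might. */
  static String fieldType(List<String> rowType, int index) {
    return rowType.get(index); // throws if the index is outside the row type
  }

  /** Returns true if the lookup fails because the reference points past the row type. */
  static boolean failsOn(List<String> rowType, int index) {
    try {
      fieldType(rowType, index);
      return false;
    } catch (IndexOutOfBoundsException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Assumed shapes: the Union exposes only $0, the Calc on top projects $0 plus a computed $1.
    List<String> unionRowType = Arrays.asList("INTEGER");
    List<String> calcRowType = Arrays.asList("INTEGER", "INTEGER");

    int predicateField = 1; // predicate written against the Calc's output, e.g. "$1 = 5"

    System.out.println(failsOn(calcRowType, predicateField));  // prints false
    System.out.println(failsOn(unionRowType, predicateField)); // prints true
  }
}
```

      The same reference is valid against the Calc's output and invalid against the Union's rowType, which is why the predicate cannot be handed down unchanged.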

      Note I: in a similar situation, RelMdSelectivity#getSelectivity(Project) uses RelOptUtil.pushPastProject (which "Converts an expression that is based on the output fields of a Project to an equivalent expression on the Project's input fields.") to convert the predicate before passing it to the Project's input.
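      The kind of translation described in Note I can be sketched as follows. This is a toy expression model in plain Java, not Calcite's RexNode or the real RelOptUtil.pushPastProject; the names PushPastProjectSketch, Expr, pushPastProjections and maxField are hypothetical. The idea shown is only that each reference to an output field is replaced by the expression that produces that field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy expression model, not Calcite's RexNode. */
final class PushPastProjectSketch {

  /** Either a field reference (op == null) or an operator/literal with operands. */
  static final class Expr {
    final String op;           // null for a field reference
    final int field;           // field index, meaningful only when op == null
    final List<Expr> operands;

    private Expr(String op, int field, List<Expr> operands) {
      this.op = op;
      this.field = field;
      this.operands = operands;
    }

    static Expr ref(int field) {
      return new Expr(null, field, new ArrayList<>());
    }

    static Expr call(String op, Expr... operands) {
      return new Expr(op, -1, Arrays.asList(operands));
    }

    @Override public String toString() {
      if (op == null) {
        return "$" + field;
      }
      if (operands.isEmpty()) {
        return op; // a literal such as "5"
      }
      StringBuilder sb = new StringBuilder(op).append('(');
      for (int i = 0; i < operands.size(); i++) {
        if (i > 0) {
          sb.append(", ");
        }
        sb.append(operands.get(i));
      }
      return sb.append(')').toString();
    }
  }

  /** Rewrites an expression over the projection's output into one over its input. */
  static Expr pushPastProjections(Expr e, List<Expr> projects) {
    if (e.op == null) {
      return projects.get(e.field); // substitute the expression that computes this field
    }
    List<Expr> rewritten = new ArrayList<>();
    for (Expr operand : e.operands) {
      rewritten.add(pushPastProjections(operand, projects));
    }
    return new Expr(e.op, -1, rewritten);
  }

  /** Highest field index referenced, or -1 if the expression references no field. */
  static int maxField(Expr e) {
    if (e.op == null) {
      return e.field;
    }
    int max = -1;
    for (Expr operand : e.operands) {
      max = Math.max(max, maxField(operand));
    }
    return max;
  }

  public static void main(String[] args) {
    // Assumed projections: output $0 = input $0, output $1 = input $0 + 1.
    List<Expr> projects = Arrays.asList(
        Expr.ref(0),
        Expr.call("+", Expr.ref(0), Expr.call("1")));

    Expr predicate = Expr.call("=", Expr.ref(1), Expr.call("5")); // "$1 = 5" on the output
    Expr translated = pushPastProjections(predicate, projects);

    System.out.println(translated);           // prints =(+($0, 1), 5)
    System.out.println(maxField(translated)); // prints 0
  }
}
```

      After the rewrite the predicate only references $0, so it is safe to evaluate against an input whose rowType has a single field.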

      Note II: in the code snippet above, the issue in our test example only happens at line [1] and not at [2], because the "if" block calls getSelectivity instead of mq.getSelectivity. I find this a bit questionable; maybe mq.getSelectivity should be called there as well.

            People

              rubenql Ruben Q L