Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-4419

Posix regex operators cannot be used within RelBuilder

VotersWatch issueWatchersLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      Posix regex operators are special, they are defined as binary (public class SqlPosixRegexOperator extends SqlBinaryOperator), but their actual implementation takes three arguments, the third being a caseSensitive boolean flag (SqlFunctions#posixRegex(String s, String regex, Boolean caseSensitive)).
      Using them in SQL works fine. The magic happens in SqlPosixRegexOperator#createCall, which adds an extra boolean parameter (the caseSensitive flag) into the call:

      @Override public SqlCall createCall(SqlLiteral functionQualifier, SqlParserPos pos, SqlNode... operands) {
        pos = pos.plusAll(Arrays.asList(operands));
        operands = Arrays.copyOf(operands, operands.length + 1);
        operands[operands.length - 1] = SqlLiteral.createBoolean(caseSensitive, SqlParserPos.ZERO);
        return new SqlBasicCall(this, operands, pos, false, functionQualifier);
      }
      

      However, if we try to use them in a plan created by RelBuilder, problems occur.

      1) The following plan:

      RelBuilder builder = ...
      builder
        .scan("s", "emps")
        .filter(
          builder.call(
            SqlStdOperatorTable.POSIX_REGEX_CASE_SENSITIVE,
            builder.field("name"),
            builder.literal("E...")))
      ...
      

      Fails because it cannot find the implementor method

      Caused by: java.lang.IllegalStateException: Unable to implement EnumerableCalc(expr#0..4=[{inputs}], expr#5=['E...'], expr#6=[POSIX REGEX CASE SENSITIVE($t2, $t5)]...
      ...
      Caused by: java.lang.NoSuchMethodException: org.apache.calcite.runtime.SqlFunctions.posixRegex(java.lang.String, java.lang.String)
      

      2) The solution seems either adding a (somehow redundant, but necessary due to posix regex operators design) third boolean parameter

        .filter(
          builder.call(
            SqlStdOperatorTable.POSIX_REGEX_CASE_SENSITIVE,
            builder.field("name"),
            builder.literal("E..."),
            builder.literal(true)))
      

      or maybe add this parameter transparently inside RelBuilder#call code:

        private @Nonnull RexCall call(SqlOperator operator, List<RexNode> operandList) {
          switch (operator.getKind()) {
          ...
          case POSIX_REGEX_CASE_SENSITIVE:
            if (operandList.size() == 2) {
              return (RexCall) call(operator, operandList.get(0), operandList.get(1), literal(true));
            }
            break;
          case POSIX_REGEX_CASE_INSENSITIVE:
            if (operandList.size() == 2) {
              return (RexCall) call(operator, operandList.get(0), operandList.get(1), literal(false));
            }
            break;
          ...
      

      Either way, it will now fail validating the operands:

      wrong operand count 3 for POSIX REGEX CASE SENSITIVE
      java.lang.AssertionError: wrong operand count 3 for POSIX REGEX CASE SENSITIVE
      	at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
      	at org.apache.calcite.sql.SqlBinaryOperator.validRexOperands(SqlBinaryOperator.java:200)
      

      3) So, it would seem SqlPosixRegexOperator should override validRexOperands to support 2 and 3 parameters (which in fact would seem correct and aligned to the current overridden behavior of SqlPosixRegexOperator#getOperandCountRange and SqlPosixRegexOperator#checkOperandTypes:

        @Override public boolean validRexOperands(int count, Litmus litmus) {
          if (count != 2 && count != 3) {
            return litmus.fail("wrong operand count {} for {}", count, this);
          }
          return litmus.succeed();
        }
      

      But now the test fails with almost the same message as the very first one:

      Caused by: java.lang.IllegalStateException: Unable to implement EnumerableCalc(expr#0..4=[{inputs}], expr#5=['E...'], expr#6=[true], expr#7=[POSIX REGEX CASE SENSITIVE($t2, $t5, $t6)]...
      ...
      Caused by: java.lang.NoSuchMethodException: org.apache.calcite.runtime.SqlFunctions.posixRegex(java.lang.String, java.lang.String, boolean)
      

      4) The problem seems to be a mismatch between our third argument (boolean) and the implementor's third argument (Boolean). Looking at the code, it seems clear that a primitive boolean should / must be used since having a null Boolean object makes no sense (and will cause a NPE anyway). So if we change it:

      // SqlFunctions.java
      ...
      public static boolean posixRegex(String s, String regex, boolean caseSensitive) // instead of Boolean
      { ... }
      
      // ---------
      
      // BuiltInMethod.java
      ...
      POSIX_REGEX(SqlFunctions.class, "posixRegex", String.class, String.class, boolean.class), // instead of Boolean.class
      

      Now our test finally succeeds.
      Therefore, I propose to implement these sequence of changes in order to solve this issue.

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            rubenql Ruben Q L
            rubenql Ruben Q L
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 20m
                20m

                Slack

                  Issue deployment