Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1062

Allowing SqlOperator to be overridden in validation

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: core
    • Labels:
      None

      Description

      Calcite allows function to be overridden at validation step. To be more specific, users can provide their SqlOperatorTable, and, at validation step, their SqlOperatorTable will be called (method: lookupOperatorOverloads) to get a overriding function. However, so far, SqlOperator (e.g., +, - , *, etc.) does not have this mechanism yet.

      Since other systems (e.g., Apache Drill) would have more flexible type-checks for SqlOperator's operands, this mechanism is necessary for those systems to pass through the validation step.

        Activity

        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment - - edited

        Pull request:
        https://github.com/apache/calcite/pull/188

        Julian Hyde can you help review ? Thanks.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - - edited Pull request: https://github.com/apache/calcite/pull/188 Julian Hyde can you help review ? Thanks.
        Hide
        julianhyde Julian Hyde added a comment -

        Did you consider creating a new implementation of SqlOperatorTable that would "mask" an underlying operator table?

        Show
        julianhyde Julian Hyde added a comment - Did you consider creating a new implementation of SqlOperatorTable that would "mask" an underlying operator table?
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Something similar:

        Whenever a SqlOperator is passed to lookupOperatorOverloads(), we will return back a "wrapped/masked" SqlOperator which defines customized type check logic.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Something similar: Whenever a SqlOperator is passed to lookupOperatorOverloads(), we will return back a "wrapped/masked" SqlOperator which defines customized type check logic.
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        On the other hand, I added a unit test to ensure that if SqlStdOperatorTable is used, Calcite will never falsely override a given operator.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - On the other hand, I added a unit test to ensure that if SqlStdOperatorTable is used, Calcite will never falsely override a given operator.
        Hide
        julianhyde Julian Hyde added a comment -

        But why add all this new code? Anyone can supply a custom implementation of SqlOperatorTable and use it in the validator. That already works, AFAIK.

        Show
        julianhyde Julian Hyde added a comment - But why add all this new code? Anyone can supply a custom implementation of SqlOperatorTable and use it in the validator. That already works, AFAIK.
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        That is true for "SqlFunction" only. You could compare the bodies of SqlFunction.deriveType() and SqlOperator.deriveType().

        While SqlFunction.deriveType() looks up an overriding operator from SqlOperatorTable, SqlOperator.deriveType() does not.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - That is true for "SqlFunction" only. You could compare the bodies of SqlFunction.deriveType() and SqlOperator.deriveType(). While SqlFunction.deriveType() looks up an overriding operator from SqlOperatorTable, SqlOperator.deriveType() does not.
        Hide
        julianhyde Julian Hyde added a comment -

        How about extending the capabilities of SqlOperatorTable so that it can resolve overloading in operators not just functions?

        Show
        julianhyde Julian Hyde added a comment - How about extending the capabilities of SqlOperatorTable so that it can resolve overloading in operators not just functions?
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Not sure of what you mean regarding extending SqlOperatorTable.

        During the process of SqlOperator.deriveType(), the SqlOperatorTable (regardless of whether it is default or customized one) would not be used in the current logic.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Not sure of what you mean regarding extending SqlOperatorTable. During the process of SqlOperator.deriveType(), the SqlOperatorTable (regardless of whether it is default or customized one) would not be used in the current logic.
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Hi Julian Hyde, would you think it make sense? Thanks.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Hi Julian Hyde , would you think it make sense? Thanks.
        Hide
        julianhyde Julian Hyde added a comment -
        • I no longer think it needs to be in SqlOperatorTable. But I think you should unify lookupSqlOperator with lookupRoutine. Generalize lookupRoutine to handle SqlOperator, rather than copy-pasting. You could add a type parameter "<T extends SqlOperator>" if you like.
        • Would it be possible to move SqlFunction.deriveType up to SqlOperator? The methods are essentially doing the same thing.
        • Use StringBuilder not StringBuffer. In fact for that particular case, just use +
        • Use plural name for collections: rename operatorType to operatorTypes.
        • Remove the 'toUpperCase' hack. The core is case-sensitive.
        • Any reason your test cannot be in the base class SqlOperatorBaseTest? It should be obvious that test cases shouldn't be defined in CalciteSqlOperatorTest.
        • Use assertThat, not assertEquals
        Show
        julianhyde Julian Hyde added a comment - I no longer think it needs to be in SqlOperatorTable. But I think you should unify lookupSqlOperator with lookupRoutine. Generalize lookupRoutine to handle SqlOperator, rather than copy-pasting. You could add a type parameter "<T extends SqlOperator>" if you like. Would it be possible to move SqlFunction.deriveType up to SqlOperator? The methods are essentially doing the same thing. Use StringBuilder not StringBuffer. In fact for that particular case, just use + Use plural name for collections: rename operatorType to operatorTypes. Remove the 'toUpperCase' hack. The core is case-sensitive. Any reason your test cannot be in the base class SqlOperatorBaseTest? It should be obvious that test cases shouldn't be defined in CalciteSqlOperatorTest. Use assertThat, not assertEquals
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Two questions as below:
        1. Regarding the 2nd item, there is some difference in these two. For example, SqlFunction has the concept of User-Defined-Functions while the other does not. However, I could do some consolidation for sure.

        2. Regarding the 5th item, inside ReflectiveSqlOperatorTable.lookupOperatorOverloads() (see line 114-115), Calcite converts String to uppercase before lookup.
        This implies Calcite assumes all operators/functions are registered with uppercase. Thus, if an operator/function's name is not all in uppercase when being registered, Calcite cannot find it when doing lookup. The uppercase() seems needed when we register the operators/functions.

        [1] https://github.com/apache/calcite/blob/4c7f5c20a04b4a4e736a16f801d8b5e6eded48cc/core/src/main/java/org/apache/calcite/sql/util/ReflectiveSqlOperatorTable.java#L114

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Two questions as below: 1. Regarding the 2nd item, there is some difference in these two. For example, SqlFunction has the concept of User-Defined-Functions while the other does not. However, I could do some consolidation for sure. 2. Regarding the 5th item, inside ReflectiveSqlOperatorTable.lookupOperatorOverloads() (see line 114-115), Calcite converts String to uppercase before lookup. This implies Calcite assumes all operators/functions are registered with uppercase. Thus, if an operator/function's name is not all in uppercase when being registered, Calcite cannot find it when doing lookup. The uppercase() seems needed when we register the operators/functions. [1] https://github.com/apache/calcite/blob/4c7f5c20a04b4a4e736a16f801d8b5e6eded48cc/core/src/main/java/org/apache/calcite/sql/util/ReflectiveSqlOperatorTable.java#L114
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        Update the pull request. You could please review at your convenience. Thanks.

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - Update the pull request. You could please review at your convenience. Thanks.
        Hide
        julianhyde Julian Hyde added a comment -

        I took a quick look, looks good. I'm going to test it and do some fix up (I think we should have a SqlKind value for ROW_NUMBER, and I'd like to convert the various filterOperatorXxx methods to use Iterators.filter) but I think your PR is sound.

        Show
        julianhyde Julian Hyde added a comment - I took a quick look, looks good. I'm going to test it and do some fix up (I think we should have a SqlKind value for ROW_NUMBER, and I'd like to convert the various filterOperatorXxx methods to use Iterators.filter) but I think your PR is sound.
        Hide
        julianhyde Julian Hyde added a comment -

        Sean Hsuan-Yi Chu, I am seeing a stack overflow in SqlValidatorTestCase.testLarge. Do you see that?

        Show
        julianhyde Julian Hyde added a comment - Sean Hsuan-Yi Chu , I am seeing a stack overflow in SqlValidatorTestCase.testLarge. Do you see that?
        Hide
        julianhyde Julian Hyde added a comment -

        Ah, I found it. The stack got deeper because SqlOperator.deriveType no longer calls deriveType on each operand. It is a bit more work to call deriveType early, but keeps the stack smaller, so I restored it.

        I also discovered CALCITE-1083, which was causing an O(n ^ 2) slow-down (but not affecting stack-size).

        Show
        julianhyde Julian Hyde added a comment - Ah, I found it. The stack got deeper because SqlOperator.deriveType no longer calls deriveType on each operand. It is a bit more work to call deriveType early, but keeps the stack smaller, so I restored it. I also discovered CALCITE-1083 , which was causing an O(n ^ 2) slow-down (but not affecting stack-size).
        Hide
        seanhychu Sean Hsuan-Yi Chu added a comment -

        I see. Thanks for helping point out that issue!

        Show
        seanhychu Sean Hsuan-Yi Chu added a comment - I see. Thanks for helping point out that issue!
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/c1ceba45 and http://git-wip-us.apache.org/repos/asf/calcite/commit/72b2cfb7 . Thanks for the PR, Sean Hsuan-Yi Chu !
        Hide
        julianhyde Julian Hyde added a comment -

        Resolved in release 1.7.0 (2016-03-22).

        Show
        julianhyde Julian Hyde added a comment - Resolved in release 1.7.0 (2016-03-22).

          People

          • Assignee:
            seanhychu Sean Hsuan-Yi Chu
            Reporter:
            seanhychu Sean Hsuan-Yi Chu
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development