Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-5309

Accept cursors as inputs of table functions

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      The TableFunctionTest.testTableFunctionCursorsInputs is currently disabled and a note says that it could not be implemented. This feature is necessary to implement table functions that take a query or a table as an input.

      Attachments

        Issue Links

          Activity

            bchapuis Bertil Chapuis added a comment - - edited

            I spent a some time digging into this issue and here is a summary of my understanding of the issue:

            The parser implements support for cursors and produces valid AST for the query in TableFunctionTest.testTableFunctionCursorInputs.
             

            select n from table(s.process(2, cursor(select * from table(s.GenerateStrings(?))) )) as t(u) where u > 3

            The conversion of the AST to a logical plan fails, but this can be fixed by adding support for cursors in SqlTypeAssignementRule and JavaToSqlTypeConversionRules (see [draft PR|https://github.com/apache/calcite/pull/3157/files]).

            LogicalProject(U=[$0]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 9
              LogicalFilter(condition=[>($0, 3)]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 8
                LogicalTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang.Object;]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 7
                  LogicalProject(n=[$0], s=[$1]): rowcount = 1.0, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}, id = 6
                    LogicalTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType(class org.apache.calcite.util.Smalls$IntString)], elementType=[class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {huge}, id = 0
             

            This logical plan is not supported by the enumerable adaptor and produces the following exception.

            Error while preparing statement [select *
            from table("s"."process"(2,
            cursor(select * from table("s"."GenerateStrings"(?)))
            )) as t(u)
            where u > 3]
            java.sql.SQLException: Error while preparing statement [select *
            from table("s"."process"(2,
            cursor(select * from table("s"."GenerateStrings"(?)))
            )) as t(u)
            where u > 3]
            
            ...
            
            Caused by: java.lang.IllegalStateException: Unable to implement EnumerableCalc(expr#0=[{inputs}], expr#1=[3], expr#2=[>($t0, $t1)], result=[$t0], $condition=[$t2]): rowcount = 1.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 47
              EnumerableTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang.Object;]): rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io}, id = 43
                EnumerableTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType(class org.apache.calcite.util.Smalls$IntString)], elementType=[class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 35
            
            ...
            
            Suppressed: java.lang.NullPointerException: inputGetter
                    at java.base/java.util.Objects.requireNonNull(Objects.java:233)
                    at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:1029)
                    at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:101)
                    at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:113)
            
            ...

            In my understanding, a cursor should be a relational expression (i.e. a RelNode). Therefore, the next step should be to add a Cursor class to the o.a.c.rel.core package, a EnumerableCursor class to the o.a.c.adapter.enumarable package, and glue things together.

            julianhyde is this intuition correct? Thank you for your help, I try to figure out the high level steps needed to address this issue.

             

            bchapuis Bertil Chapuis added a comment - - edited I spent a some time digging into this issue and here is a summary of my understanding of the issue: The parser implements support for cursors and produces valid AST for the query in TableFunctionTest.testTableFunctionCursorInputs.   select n from table(s.process(2, cursor(select * from table(s.GenerateStrings(?))) )) as t(u) where u > 3 The conversion of the AST to a logical plan fails, but this can be fixed by adding support for cursors in SqlTypeAssignementRule and JavaToSqlTypeConversionRules (see [draft PR| https://github.com/apache/calcite/pull/3157/files ]). LogicalProject(U=[$0]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 9   LogicalFilter(condition=[>($0, 3)]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 8     LogicalTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang. Object ;]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 7       LogicalProject(n=[$0], s=[$1]): rowcount = 1.0, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}, id = 6         LogicalTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType( class org.apache.calcite.util.Smalls$IntString)], elementType=[ class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {huge}, id = 0 This logical plan is not supported by the enumerable adaptor and produces the following exception. Error while preparing statement [select * from table( "s" . "process" (2, cursor(select * from table( "s" . "GenerateStrings" (?))) )) as t(u) where u > 3] java.sql.SQLException: Error while preparing statement [select * from table( "s" . "process" (2, cursor(select * from table( "s" . "GenerateStrings" (?))) )) as t(u) where u > 3] ... Caused by: java.lang.IllegalStateException: Unable to implement EnumerableCalc(expr#0=[{inputs}], expr#1=[3], expr#2=[>($t0, $t1)], result=[$t0], $condition=[$t2]): rowcount = 1.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 47   EnumerableTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang. Object ;]): rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io}, id = 43     EnumerableTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType( class org.apache.calcite.util.Smalls$IntString)], elementType=[ class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 35 ... Suppressed: java.lang.NullPointerException: inputGetter         at java.base/java.util.Objects.requireNonNull(Objects.java:233)         at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:1029)         at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:101)         at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:113) ... In my understanding, a cursor should be a relational expression (i.e. a RelNode). Therefore, the next step should be to add a Cursor class to the o.a.c.rel.core package, a EnumerableCursor class to the o.a.c.adapter.enumarable package, and glue things together. julianhyde is this intuition correct? Thank you for your help, I try to figure out the high level steps needed to address this issue.  
            julianhyde Julian Hyde added a comment -

            Many thanks for doing this research. I agree with everything except that Cursor should be a new type of relational expression. I think cursor is just a use of a relational expression, so any existing relational expression will do.

            I expect there will need to be a method something like 'RelNode implementCursor(int)'.

            (Why is the cursor concept needed in the SQL syntax and the AST but not needed in the relational algebra? I'm not totally sure, but I think it has something to do with namespaces. Relational algebra doesn't need namespaces, because each relational expression just refers to its inputs. The Spool operator - see CALCITE-481 - deals with the outlier case that referring to an input is not sufficient.)

            julianhyde Julian Hyde added a comment - Many thanks for doing this research. I agree with everything except that Cursor should be a new type of relational expression. I think cursor is just a use of a relational expression, so any existing relational expression will do. I expect there will need to be a method something like 'RelNode implementCursor(int)'. (Why is the cursor concept needed in the SQL syntax and the AST but not needed in the relational algebra? I'm not totally sure, but I think it has something to do with namespaces. Relational algebra doesn't need namespaces, because each relational expression just refers to its inputs. The Spool operator - see CALCITE-481 - deals with the outlier case that referring to an input is not sufficient.)
            bchapuis Bertil Chapuis added a comment - - edited

            Thanks a lot for your answer, it helps a lot.

            Could you elaborate a little bit more on the method you mention (i.e., 'RelNode implementCursor(int)')? More precisely:

            • I'm not sure to understand the 'int' parameter. I would expect the input of the cursor to be a sub query (e.g., 'cursor(select * from table(s.GenerateStrings(?)))'). In my previous attempts at implementing cursors, I tried to implement the cursor with a method that takes an Enumerable<Object[]> as a parameter.
            • Is there a specific location in the codebase (package or class) where you expect this method to be located?
            bchapuis Bertil Chapuis added a comment - - edited Thanks a lot for your answer, it helps a lot. Could you elaborate a little bit more on the method you mention (i.e., 'RelNode implementCursor(int)')? More precisely: I'm not sure to understand the 'int' parameter. I would expect the input of the cursor to be a sub query (e.g., 'cursor(select * from table(s.GenerateStrings(?)))'). In my previous attempts at implementing cursors, I tried to implement the cursor with a method that takes an Enumerable<Object[]> as a parameter. Is there a specific location in the codebase (package or class) where you expect this method to be located?

            People

              bchapuis Bertil Chapuis
              bchapuis Bertil Chapuis
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated: