The TableFunctionTest.testTableFunctionCursorsInputs is currently disabled and a note says that it could not be implemented. This feature is necessary to implement table functions that take a query or a table as an input.
I spent a some time digging into this issue and here is a summary of my understanding of the issue:
The parser implements support for cursors and produces valid AST for the query in TableFunctionTest.testTableFunctionCursorInputs.
select n from table(s.process(2, cursor(select * from table(s.GenerateStrings(?))) )) as t(u) where u > 3
The conversion of the AST to a logical plan fails, but this can be fixed by adding support for cursors in SqlTypeAssignementRule and JavaToSqlTypeConversionRules (see [draft PR|https://github.com/apache/calcite/pull/3157/files]).
This logical plan is not supported by the enumerable adaptor and produces the following exception.
Error while preparing statement [select *
from table("s"."process"(2,
cursor(select * from table("s"."GenerateStrings"(?)))
)) as t(u)
where u > 3]
java.sql.SQLException: Error while preparing statement [select *
from table("s"."process"(2,
cursor(select * from table("s"."GenerateStrings"(?)))
)) as t(u)
where u > 3]
...
Caused by: java.lang.IllegalStateException: Unable to implement EnumerableCalc(expr#0=[{inputs}], expr#1=[3], expr#2=[>($t0, $t1)], result=[$t0], $condition=[$t2]): rowcount = 1.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 47
EnumerableTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang.Object;]): rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io}, id = 43
EnumerableTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType(class org.apache.calcite.util.Smalls$IntString)], elementType=[class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 35
...
Suppressed: java.lang.NullPointerException: inputGetter
at java.base/java.util.Objects.requireNonNull(Objects.java:233)
at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:1029)
at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:101)
at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:113)
...
In my understanding, a cursor should be a relational expression (i.e. a RelNode). Therefore, the next step should be to add a Cursor class to the o.a.c.rel.core package, a EnumerableCursor class to the o.a.c.adapter.enumarable package, and glue things together.
julianhyde is this intuition correct? Thank you for your help, I try to figure out the high level steps needed to address this issue.
Bertil Chapuis
added a comment - - edited I spent a some time digging into this issue and here is a summary of my understanding of the issue:
The parser implements support for cursors and produces valid AST for the query in TableFunctionTest.testTableFunctionCursorInputs.
select n from table(s.process(2, cursor(select * from table(s.GenerateStrings(?))) )) as t(u) where u > 3
The conversion of the AST to a logical plan fails, but this can be fixed by adding support for cursors in SqlTypeAssignementRule and JavaToSqlTypeConversionRules (see [draft PR| https://github.com/apache/calcite/pull/3157/files ]).
LogicalProject(U=[$0]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 9
LogicalFilter(condition=[>($0, 3)]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 8
LogicalTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang. Object ;]): rowcount = 1.0, cumulative cost = {Infinity rows, Infinity cpu, Infinity io}, id = 7
LogicalProject(n=[$0], s=[$1]): rowcount = 1.0, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}, id = 6
LogicalTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType( class org.apache.calcite.util.Smalls$IntString)], elementType=[ class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {huge}, id = 0
This logical plan is not supported by the enumerable adaptor and produces the following exception.
Error while preparing statement [select *
from table( "s" . "process" (2,
cursor(select * from table( "s" . "GenerateStrings" (?)))
)) as t(u)
where u > 3]
java.sql.SQLException: Error while preparing statement [select *
from table( "s" . "process" (2,
cursor(select * from table( "s" . "GenerateStrings" (?)))
)) as t(u)
where u > 3]
...
Caused by: java.lang.IllegalStateException: Unable to implement EnumerableCalc(expr#0=[{inputs}], expr#1=[3], expr#2=[>($t0, $t1)], result=[$t0], $condition=[$t2]): rowcount = 1.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 47
EnumerableTableFunctionScan(invocation=[process(2, CAST($0):CURSOR NOT NULL)], rowType=[RecordType(INTEGER result)], elementType=[class [Ljava.lang. Object ;]): rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io}, id = 43
EnumerableTableFunctionScan(invocation=[GenerateStrings(?0)], rowType=[JavaType( class org.apache.calcite.util.Smalls$IntString)], elementType=[ class org.apache.calcite.util.Smalls$IntString]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 35
...
Suppressed: java.lang.NullPointerException: inputGetter
at java.base/java.util.Objects.requireNonNull(Objects.java:233)
at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:1029)
at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitInputRef(RexToLixTranslator.java:101)
at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:113)
...
In my understanding, a cursor should be a relational expression (i.e. a RelNode). Therefore, the next step should be to add a Cursor class to the o.a.c.rel.core package, a EnumerableCursor class to the o.a.c.adapter.enumarable package, and glue things together.
julianhyde is this intuition correct? Thank you for your help, I try to figure out the high level steps needed to address this issue.
Many thanks for doing this research. I agree with everything except that Cursor should be a new type of relational expression. I think cursor is just a use of a relational expression, so any existing relational expression will do.
I expect there will need to be a method something like 'RelNode implementCursor(int)'.
(Why is the cursor concept needed in the SQL syntax and the AST but not needed in the relational algebra? I'm not totally sure, but I think it has something to do with namespaces. Relational algebra doesn't need namespaces, because each relational expression just refers to its inputs. The Spool operator - see CALCITE-481 - deals with the outlier case that referring to an input is not sufficient.)
Julian Hyde
added a comment - Many thanks for doing this research. I agree with everything except that Cursor should be a new type of relational expression. I think cursor is just a use of a relational expression, so any existing relational expression will do.
I expect there will need to be a method something like 'RelNode implementCursor(int)'.
(Why is the cursor concept needed in the SQL syntax and the AST but not needed in the relational algebra? I'm not totally sure, but I think it has something to do with namespaces. Relational algebra doesn't need namespaces, because each relational expression just refers to its inputs. The Spool operator - see CALCITE-481 - deals with the outlier case that referring to an input is not sufficient.)
Could you elaborate a little bit more on the method you mention (i.e., 'RelNode implementCursor(int)')? More precisely:
I'm not sure to understand the 'int' parameter. I would expect the input of the cursor to be a sub query (e.g., 'cursor(select * from table(s.GenerateStrings(?)))'). In my previous attempts at implementing cursors, I tried to implement the cursor with a method that takes an Enumerable<Object[]> as a parameter.
Is there a specific location in the codebase (package or class) where you expect this method to be located?
Bertil Chapuis
added a comment - - edited Thanks a lot for your answer, it helps a lot.
Could you elaborate a little bit more on the method you mention (i.e., 'RelNode implementCursor(int)')? More precisely:
I'm not sure to understand the 'int' parameter. I would expect the input of the cursor to be a sub query (e.g., 'cursor(select * from table(s.GenerateStrings(?)))'). In my previous attempts at implementing cursors, I tried to implement the cursor with a method that takes an Enumerable<Object[]> as a parameter.
Is there a specific location in the codebase (package or class) where you expect this method to be located?
I spent a some time digging into this issue and here is a summary of my understanding of the issue:
The parser implements support for cursors and produces valid AST for the query in TableFunctionTest.testTableFunctionCursorInputs.
The conversion of the AST to a logical plan fails, but this can be fixed by adding support for cursors in SqlTypeAssignementRule and JavaToSqlTypeConversionRules (see [draft PR|https://github.com/apache/calcite/pull/3157/files]).
This logical plan is not supported by the enumerable adaptor and produces the following exception.
In my understanding, a cursor should be a relational expression (i.e. a RelNode). Therefore, the next step should be to add a Cursor class to the o.a.c.rel.core package, a EnumerableCursor class to the o.a.c.adapter.enumarable package, and glue things together.
julianhyde is this intuition correct? Thank you for your help, I try to figure out the high level steps needed to address this issue.