Uploaded image for project: 'Derby'
  1. Derby
  2. DERBY-5554

NullPointerException in generated VTI code

    Details

    • Urgency:
      Normal
    • Issue & fix info:
      Patch Available, Release Note Needed, Repro attached
    • Bug behavior facts:
      Crash

      Description

      With the latest 10.8.2.2 binary distribution, the following short script produces a NullPointerException:

      java -Dij.exceptionTrace=true org.apache.derby.tools.ij
      ij> connect 'jdbc:derby:brydb;create=true';
      ij> create table t1 (a int);
      ij> SELECT T2., systabs., syscgs.conglomeratenumber
      FROM
      SYS.SYSTABLES systabs, sys.sysconglomerates syscgs,
      TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2
      WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid;

      The exception trace is pasted below:

      ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
      java.sql.SQLException: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.tools.ij.ij.executeImmediate(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.doCatch(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.mainCore(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.main(Unknown Source)
      at org.apache.derby.tools.ij.main(Unknown Source)
      Caused by: java.sql.SQLException: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
      ... 18 more
      Caused by: java.sql.SQLException: Java exception: ': java.lang.NullPointerException'.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.javaException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
      ... 15 more
      Caused by: java.lang.NullPointerException
      at org.apache.derby.exe.acf81e0010x0134x6972x0511x0000033820000.g0(Unknown Source)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:616)
      at org.apache.derby.impl.services.reflect.ReflectMethod.invoke(Unknown Source)
      at org.apache.derby.impl.sql.execute.VTIResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openRight(Unknown Source)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
      ... 11 more
      ERROR XJ001: Java exception: ': java.lang.NullPointerException'.
      java.sql.SQLException: Java exception: ': java.lang.NullPointerException'.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.Util.javaException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.tools.ij.ij.executeImmediate(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.doCatch(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.mainCore(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.main(Unknown Source)
      at org.apache.derby.tools.ij.main(Unknown Source)
      Caused by: java.sql.SQLException: Java exception: ': java.lang.NullPointerException'.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
      ... 19 more
      Caused by: java.lang.NullPointerException
      at org.apache.derby.exe.acf81e0010x0134x6972x0511x0000033820000.g0(Unknown Source)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:616)
      at org.apache.derby.impl.services.reflect.ReflectMethod.invoke(Unknown Source)
      at org.apache.derby.impl.sql.execute.VTIResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openRight(Unknown Source)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
      ... 11 more

      1. derby-5554-01-aa-addTableIDcolumn.diff
        11 kB
        Rick Hillegas
      2. derby-5554-01-ab-addTableIDcolumn.diff
        10 kB
        Rick Hillegas
      3. derby-5554-02-aa-0argConstructor.diff
        16 kB
        Rick Hillegas
      4. derby-5554-03-aa-forbidVTItoVTIjoins.diff
        2 kB
        Rick Hillegas
      5. derby-5554-04-aa-forceVTIAfterJoin.diff
        4 kB
        Rick Hillegas
      6. derby-5554-04-ab-remapVTIarg.diff
        6 kB
        Rick Hillegas
      7. derby-5554-04-ac-remapVTIarg.diff
        5 kB
        Rick Hillegas
      8. releaseNote.html
        2 kB
        Rick Hillegas
      9. releaseNote.html
        2 kB
        Rick Hillegas

        Issue Links

          Activity

          Hide
          knutanders Knut Anders Hatlen added a comment -

          Seen all the way back to 10.3.1.4 (earlier releases didn't support the syntax - DERBY-2152).

          Show
          knutanders Knut Anders Hatlen added a comment - Seen all the way back to 10.3.1.4 (earlier releases didn't support the syntax - DERBY-2152 ).
          Hide
          rhillegas Rick Hillegas added a comment -

          The failing query looks very similar to the sample query for SYSCS_DIAG.SPACE_TABLE found in the Reference Manual section titled "SYSCS_DIAG diagnostic tables and functions":

          SELECT T2.*
          FROM
          SYS.SYSTABLES systabs,
          TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2
          WHERE systabs.tabletype = 'T'

          I confess that I don't understand the meaning of either query. When we implemented user-defined table functions, we specified that the arguments to the table function should be expressions built from the following:

          1) literals
          2) ? parameters
          3) correlated references to variables in outer query blocks

          That is, the arguments were supposed to be constant within the query block. We specifically did not want table function arguments to be variables from other tables in the FROM list. However, we don't enforce this limitation. You get a similar NPE if you replace the SYSCS_DIAG.SPACE_TABLE element in the FROM list with an invocation of a user defined table function which takes the SYSTABLES.TABLENAME variable as an argument.

          I'm trying to figure out what these queries mean. My usual approach to understanding a query is to follow these steps:

          A) Compute the cartesian product of the elements in the FROM list.

          B) Then join and restrict by applying the WHERE clause to the cartesian product.

          C) Then apply the other clauses in an order which doesn't affect this discussion.

          With these queries, what is (A)? I don't see a cartesian product here because there is an implicit, undefined join in the FROM list.

          My gut feeling is that these may be illegal queries, but I haven't found the relevant chapter and verse in the SQL Standard. I would be interested in other people's theories about the meaning of these queries and how you thread your way through the SQL Standard in order to determine that meaning.

          Thanks,
          -Rick

          Show
          rhillegas Rick Hillegas added a comment - The failing query looks very similar to the sample query for SYSCS_DIAG.SPACE_TABLE found in the Reference Manual section titled "SYSCS_DIAG diagnostic tables and functions": SELECT T2.* FROM SYS.SYSTABLES systabs, TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2 WHERE systabs.tabletype = 'T' I confess that I don't understand the meaning of either query. When we implemented user-defined table functions, we specified that the arguments to the table function should be expressions built from the following: 1) literals 2) ? parameters 3) correlated references to variables in outer query blocks That is, the arguments were supposed to be constant within the query block. We specifically did not want table function arguments to be variables from other tables in the FROM list. However, we don't enforce this limitation. You get a similar NPE if you replace the SYSCS_DIAG.SPACE_TABLE element in the FROM list with an invocation of a user defined table function which takes the SYSTABLES.TABLENAME variable as an argument. I'm trying to figure out what these queries mean. My usual approach to understanding a query is to follow these steps: A) Compute the cartesian product of the elements in the FROM list. B) Then join and restrict by applying the WHERE clause to the cartesian product. C) Then apply the other clauses in an order which doesn't affect this discussion. With these queries, what is (A)? I don't see a cartesian product here because there is an implicit, undefined join in the FROM list. My gut feeling is that these may be illegal queries, but I haven't found the relevant chapter and verse in the SQL Standard. I would be interested in other people's theories about the meaning of these queries and how you thread your way through the SQL Standard in order to determine that meaning. Thanks, -Rick
          Hide
          dagw Dag H. Wanvik added a comment -

          I think the intention is clear: you want a union of tables resulting from subtables of the form TABLE (SYSCS_DIAG.SPACE_TABLE(<name of a system table>), where <name of a system table> would range over all the system tables of type 'T'. Now, how would a correct/better query for this look? It would need to employ correlated references to variables in outer query blocks, since literal and ? are not applicable here...

          Show
          dagw Dag H. Wanvik added a comment - I think the intention is clear: you want a union of tables resulting from subtables of the form TABLE (SYSCS_DIAG.SPACE_TABLE(<name of a system table>), where <name of a system table> would range over all the system tables of type 'T'. Now, how would a correct/better query for this look? It would need to employ correlated references to variables in outer query blocks, since literal and ? are not applicable here...
          Hide
          knutanders Knut Anders Hatlen added a comment -

          The query passes if I add an optimizer override to force the table function to be the inner table in the join (which makes some sense, since otherwise the argument to the table function would not have a known value):

          SELECT T2., systabs., syscgs.conglomeratenumber
          FROM --DERBY-PROPERTIES joinOrder=fixed
          SYS.SYSTABLES systabs, sys.sysconglomerates syscgs,
          TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2
          WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid;

          If I reverse the join order, I see this error:

          ij> SELECT T2., systabs., syscgs.conglomeratenumber
          > FROM --DERBY-PROPERTIES joinOrder=fixed
          > TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2, sys.sysconglomerates syscgs, SYS.SYSTABLES systabs
          > WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid;
          ERROR 42Y70: The user specified an illegal join order. This could be caused by a join column from an inner table being passed as a parameter to an external virtual table.

          The error message implies that the original query was supposed to work, I think, but the optimizer for some reason didn't enforce the correct ordering.

          Or maybe the optimizer does enforce the requirement that the table function call must be the inner table relative to the table from which it takes its arguments, but the query only works if the table function call is the innermost table? For example, it does accept the following forced join order, in which systabs.tablename should be known when SPACE_TABLE is called, but it still fails with a NullPointerException during execution:

          SELECT T2., systabs., syscgs.conglomeratenumber
          FROM --DERBY-PROPERTIES joinOrder=fixed
          SYS.SYSTABLES systabs,
          TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2,
          sys.sysconglomerates syscgs
          WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid;

          Show
          knutanders Knut Anders Hatlen added a comment - The query passes if I add an optimizer override to force the table function to be the inner table in the join (which makes some sense, since otherwise the argument to the table function would not have a known value): SELECT T2. , systabs. , syscgs.conglomeratenumber FROM --DERBY-PROPERTIES joinOrder=fixed SYS.SYSTABLES systabs, sys.sysconglomerates syscgs, TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2 WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid; If I reverse the join order, I see this error: ij> SELECT T2. , systabs. , syscgs.conglomeratenumber > FROM --DERBY-PROPERTIES joinOrder=fixed > TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2, sys.sysconglomerates syscgs, SYS.SYSTABLES systabs > WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid; ERROR 42Y70: The user specified an illegal join order. This could be caused by a join column from an inner table being passed as a parameter to an external virtual table. The error message implies that the original query was supposed to work, I think, but the optimizer for some reason didn't enforce the correct ordering. Or maybe the optimizer does enforce the requirement that the table function call must be the inner table relative to the table from which it takes its arguments, but the query only works if the table function call is the innermost table? For example, it does accept the following forced join order, in which systabs.tablename should be known when SPACE_TABLE is called, but it still fails with a NullPointerException during execution: SELECT T2. , systabs. , syscgs.conglomeratenumber FROM --DERBY-PROPERTIES joinOrder=fixed SYS.SYSTABLES systabs, TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2, sys.sysconglomerates syscgs WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid;
          Hide
          rhillegas Rick Hillegas added a comment -

          Note that Derby does not allow subqueries in the FROM list to join to other tables in the query block. This behavior of Derby seems correct to me. At the very least it is odd that you can get around this limitation by wrapping the query in a table function.

          ij> connect 'jdbc:derby:memory:db;create=true';
          ij> create table t1 ( a int );
          0 rows inserted/updated/deleted
          ij> create table t2( b int );
          0 rows inserted/updated/deleted
          ij> select *
          from t1 t, ( select * from t2 ) s
          where s.b > t.a;
          A |B
          -----------------------

          0 rows selected
          ij> select *
          from t1 t
          where exists ( select * from t2 where b > t.a );
          A
          -----------

          0 rows selected
          ij> select *
          from t1 t, ( select * from t2 where b > t.a ) s;
          ERROR 42X04: Column 'T.A' is either not in any table in the FROM list or appears within a join specification and is outside the scope of the join specification or appears in a HAVING clause and is not in the GROUP BY list. If this is a CREATE or ALTER TABLE statement then 'T.A' is not a column in the target table.

          Show
          rhillegas Rick Hillegas added a comment - Note that Derby does not allow subqueries in the FROM list to join to other tables in the query block. This behavior of Derby seems correct to me. At the very least it is odd that you can get around this limitation by wrapping the query in a table function. ij> connect 'jdbc:derby:memory:db;create=true'; ij> create table t1 ( a int ); 0 rows inserted/updated/deleted ij> create table t2( b int ); 0 rows inserted/updated/deleted ij> select * from t1 t, ( select * from t2 ) s where s.b > t.a; A |B ----------------------- 0 rows selected ij> select * from t1 t where exists ( select * from t2 where b > t.a ); A ----------- 0 rows selected ij> select * from t1 t, ( select * from t2 where b > t.a ) s; ERROR 42X04: Column 'T.A' is either not in any table in the FROM list or appears within a join specification and is outside the scope of the join specification or appears in a HAVING clause and is not in the GROUP BY list. If this is a CREATE or ALTER TABLE statement then 'T.A' is not a column in the target table.
          Hide
          rhillegas Rick Hillegas added a comment -

          I believe that chapter and verse on this syntax can be found in part 2 of the SQL Standard, section 7.6 (<table reference>), Syntax Rule 6.a:

          "If TR is contained in a <from clause> FC with no intervening <query expression>, then the scope clause SC of TR is the <select statement: single row> or innermost <query specification> that contains FC. The scope of a range variable of TR is the <select list>, <where clause>, <group by clause>, <having clause>, and <window clause> of SC, together with every <lateral derived table> that is simply contained in FC and is preceded by TR, and every <collection derived table> that is simply contained in FC and is preceded by TR, and the <join condition> of all <joined table>s contained in SC that contain TR. If SC is the <query specification> that is the <query expression body> of a simple table query STQ, then the scope of a range variable of TR also includes the <order by clause> of STQ."

          When applied to Bryan's original query, this says that the systabs table reference can be mentioned in the query's SELECT list, WHERE clause, GROUP BY clause, HAVING clause, WINDOW clause, and ORDER BY clause. The systabs table reference may be mentioned in the FROM clause only as follows:

          a) in some syntax which Derby does not support (lateral derived tables and collection derived tables)

          b) in the ON clause of a JOIN to another tabular data set.

          The sample SYSCS_DIAG.SPACE_TABLE syntax given in the Reference Manual is illegal and has no meaning.

          I don't see how to write a legal query which expresses Bryan's intent. We could consider adding a TABLEID column to the SPACE_TABLE output and change the vti's arguments so that they accept the % wildcard (like the metadata calls). Then something like this might capture Bryan's intent:

          SELECT T2., systabs., syscgs.conglomeratenumber
          FROM
          SYS.SYSTABLES systabs, sys.sysconglomerates syscgs,
          TABLE (SYSCS_DIAG.SPACE_TABLE( '%' )) AS T2
          WHERE systabs.tabletype = 'T'
          and systabs.tableid = syscgs.tableid
          and systabs.tableid = t2.tableid;

          Show
          rhillegas Rick Hillegas added a comment - I believe that chapter and verse on this syntax can be found in part 2 of the SQL Standard, section 7.6 (<table reference>), Syntax Rule 6.a: "If TR is contained in a <from clause> FC with no intervening <query expression>, then the scope clause SC of TR is the <select statement: single row> or innermost <query specification> that contains FC. The scope of a range variable of TR is the <select list>, <where clause>, <group by clause>, <having clause>, and <window clause> of SC, together with every <lateral derived table> that is simply contained in FC and is preceded by TR, and every <collection derived table> that is simply contained in FC and is preceded by TR, and the <join condition> of all <joined table>s contained in SC that contain TR. If SC is the <query specification> that is the <query expression body> of a simple table query STQ, then the scope of a range variable of TR also includes the <order by clause> of STQ." When applied to Bryan's original query, this says that the systabs table reference can be mentioned in the query's SELECT list, WHERE clause, GROUP BY clause, HAVING clause, WINDOW clause, and ORDER BY clause. The systabs table reference may be mentioned in the FROM clause only as follows: a) in some syntax which Derby does not support (lateral derived tables and collection derived tables) b) in the ON clause of a JOIN to another tabular data set. The sample SYSCS_DIAG.SPACE_TABLE syntax given in the Reference Manual is illegal and has no meaning. I don't see how to write a legal query which expresses Bryan's intent. We could consider adding a TABLEID column to the SPACE_TABLE output and change the vti's arguments so that they accept the % wildcard (like the metadata calls). Then something like this might capture Bryan's intent: SELECT T2. , systabs. , syscgs.conglomeratenumber FROM SYS.SYSTABLES systabs, sys.sysconglomerates syscgs, TABLE (SYSCS_DIAG.SPACE_TABLE( '%' )) AS T2 WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid and systabs.tableid = t2.tableid;
          Hide
          knutanders Knut Anders Hatlen added a comment -

          Since SPACE_TABLE currently accepts one or two arguments (schema is optional), we could probably also make it to accept zero arguments, and return information for all tables when it's called that way. Then we wouldn't break applications that call their tables %.

          Show
          knutanders Knut Anders Hatlen added a comment - Since SPACE_TABLE currently accepts one or two arguments (schema is optional), we could probably also make it to accept zero arguments, and return information for all tables when it's called that way. Then we wouldn't break applications that call their tables %.
          Hide
          knutanders Knut Anders Hatlen added a comment -

          By the way, here's an assert failure in a similar query (NPE in insane builds):

          ij> select * from sys.systables st, sys.sysconglomerates sc, (select * from table(syscs_diag.space_table(st.tablename)) t2) t3 where st.tabletype='T' and st.tableid=sc.tableid;
          ERROR XJ001: Java exception: 'ASSERT FAILED Found multiple optimizables that share one or more referenced table numbers (esp: '

          {0}

          '), but that should not be the case.: org.apache.derby.shared.common.sanity.AssertFailure'.

          Show
          knutanders Knut Anders Hatlen added a comment - By the way, here's an assert failure in a similar query (NPE in insane builds): ij> select * from sys.systables st, sys.sysconglomerates sc, (select * from table(syscs_diag.space_table(st.tablename)) t2) t3 where st.tabletype='T' and st.tableid=sc.tableid; ERROR XJ001: Java exception: 'ASSERT FAILED Found multiple optimizables that share one or more referenced table numbers (esp: ' {0} '), but that should not be the case.: org.apache.derby.shared.common.sanity.AssertFailure'.
          Hide
          bryanpendleton Bryan Pendleton added a comment -

          Anything that makes it easier to call SPACE_TABLE is fine, in my opinion,
          since that's one of the most commonly called functions in our built-in set.

          Show
          bryanpendleton Bryan Pendleton added a comment - Anything that makes it easier to call SPACE_TABLE is fine, in my opinion, since that's one of the most commonly called functions in our built-in set.
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching derby-5554-01-aa-addTableIDcolumn.diff. This patch adds a trailing TABLEID column to the SPACE_TABLE vti. This is the first step toward addressing this issue, as outlined in the plan attached to DERBY-5779. I am running regression tests now.

          Touches the following files:

          M java/engine/org/apache/derby/diag/SpaceTable.java
          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java

          Show
          rhillegas Rick Hillegas added a comment - Attaching derby-5554-01-aa-addTableIDcolumn.diff. This patch adds a trailing TABLEID column to the SPACE_TABLE vti. This is the first step toward addressing this issue, as outlined in the plan attached to DERBY-5779 . I am running regression tests now. Touches the following files: M java/engine/org/apache/derby/diag/SpaceTable.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching a second rev of the patch: derby-5554-01-ab-addTableIDcolumn.diff. The second rev gets the tableid from a more general location and will work better when we add a no-arg constructor to the vti.

          Tests passed cleanly for me on this rev. Committed at subversion revision 1351714.

          Show
          rhillegas Rick Hillegas added a comment - Attaching a second rev of the patch: derby-5554-01-ab-addTableIDcolumn.diff. The second rev gets the tableid from a more general location and will work better when we add a no-arg constructor to the vti. Tests passed cleanly for me on this rev. Committed at subversion revision 1351714.
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching derby-5554-02-aa-0argConstructor.diff. This patch adds a 0-arg constructor to SPACE_TABLE. I am running regression tests now.

          The meaning of the 0-arg constructor is "give me information on all conglomerates for all tables". With this patch, the following statement returns space information on all conglomerates:

          select * from table(syscs_diag.space_table()) x
          order by conglomeratename;

          This patch allows us to re-write the example in the Reference Guide section on SPACE_TABLE. The following new syntax is valid standard SQL:

          select t2.*
          from
          sys.systables systabs,
          table (syscs_diag.space_table()) as t2
          where systabs.tabletype = 'T'
          and systabs.tableid = t2.tableid;

          Bryan's query can now be re-written as follows. Note that the join to SYSCONGLOMERATES gives rise to a cartesian product because there is no unique join column between SYSCONGLOMERATES and SPACE_TABLE. This was true of the original query also:

          select t2., systabs., syscgs.conglomeratenumber
          from
          sys.systables systabs, sys.sysconglomerates syscgs,
          table (syscs_diag.space_table()) as t2
          where systabs.tabletype = 'T'
          and systabs.tableid = t2.tableid
          and systabs.tableid = syscgs.tableid;

          Touches the following files:

          ----------

          M java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java
          M java/engine/org/apache/derby/iapi/sql/dictionary/DataDictionary.java

          Expands the meaning of getConglomerateDescriptors( UUID ). If the UUID argument is null, this method now returns all of the conglomerates in the database.

          ----------

          M java/engine/org/apache/derby/diag/SpaceTable.java

          Gives SPACE_TABLE a 0-arg construtor which behaves as described above.

          ----------

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java

          Tests for the new constructor.

          Show
          rhillegas Rick Hillegas added a comment - Attaching derby-5554-02-aa-0argConstructor.diff. This patch adds a 0-arg constructor to SPACE_TABLE. I am running regression tests now. The meaning of the 0-arg constructor is "give me information on all conglomerates for all tables". With this patch, the following statement returns space information on all conglomerates: select * from table(syscs_diag.space_table()) x order by conglomeratename; This patch allows us to re-write the example in the Reference Guide section on SPACE_TABLE. The following new syntax is valid standard SQL: select t2.* from sys.systables systabs, table (syscs_diag.space_table()) as t2 where systabs.tabletype = 'T' and systabs.tableid = t2.tableid; Bryan's query can now be re-written as follows. Note that the join to SYSCONGLOMERATES gives rise to a cartesian product because there is no unique join column between SYSCONGLOMERATES and SPACE_TABLE. This was true of the original query also: select t2. , systabs. , syscgs.conglomeratenumber from sys.systables systabs, sys.sysconglomerates syscgs, table (syscs_diag.space_table()) as t2 where systabs.tabletype = 'T' and systabs.tableid = t2.tableid and systabs.tableid = syscgs.tableid; Touches the following files: ---------- M java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java M java/engine/org/apache/derby/iapi/sql/dictionary/DataDictionary.java Expands the meaning of getConglomerateDescriptors( UUID ). If the UUID argument is null, this method now returns all of the conglomerates in the database. ---------- M java/engine/org/apache/derby/diag/SpaceTable.java Gives SPACE_TABLE a 0-arg construtor which behaves as described above. ---------- M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java Tests for the new constructor.
          Hide
          rhillegas Rick Hillegas added a comment -

          Tests passed cleanly for me on derby-5554-02-aa-0argConstructor.diff. Committed at subversion revision 1351795.

          Show
          rhillegas Rick Hillegas added a comment - Tests passed cleanly for me on derby-5554-02-aa-0argConstructor.diff. Committed at subversion revision 1351795.
          Hide
          rhillegas Rick Hillegas added a comment -

          Unless someone objects, I am inclined to port 1351714 and 1351795 to the 10.9 and 10.8 branches. That should give quicker relief to people affected by this bug. Because SPACE_TABLE is a vti, no changes were made to catalog data; so it should be ok to port the fixes to release branches. The combined patches change SPACE_TABLE as follows:

          1) The vti now has a trailing TABLEID column.

          2) The vti can now be invoked with a 0-arg constructor.

          Show
          rhillegas Rick Hillegas added a comment - Unless someone objects, I am inclined to port 1351714 and 1351795 to the 10.9 and 10.8 branches. That should give quicker relief to people affected by this bug. Because SPACE_TABLE is a vti, no changes were made to catalog data; so it should be ok to port the fixes to release branches. The combined patches change SPACE_TABLE as follows: 1) The vti now has a trailing TABLEID column. 2) The vti can now be invoked with a 0-arg constructor.
          Hide
          rhillegas Rick Hillegas added a comment -

          Ported 1351714 and 1351795 from trunk to the 10.9 branch at subversion revision 1352648.

          Ported 1351714 and 1351795 from trunk to the 10.8 branch at subversion revision 1352652.

          The port to 10.8 involved a change to SysDiagVTIMappingTest to account for a catalog query which returns different results in 10.8. That is because the SYSUSERS table does not exist in 10.8.

          Show
          rhillegas Rick Hillegas added a comment - Ported 1351714 and 1351795 from trunk to the 10.9 branch at subversion revision 1352648. Ported 1351714 and 1351795 from trunk to the 10.8 branch at subversion revision 1352652. The port to 10.8 involved a change to SysDiagVTIMappingTest to account for a catalog query which returns different results in 10.8. That is because the SYSUSERS table does not exist in 10.8.
          Hide
          rhillegas Rick Hillegas added a comment -

          I have left this issue open because I think the following work should still be done. We can close this issue and open a new issue for the follow-on work if people think that's better:

          1) Formally describe the meaning of the special syntax which we still allow for VTIs but no longer allow for table functions (viz., the joins inside the FROM list).

          2) Fix the optimizer so that query plans are picked which enforce the formal meaning. This is the crux of fixing this issue. Once we fix the optimizer, Bryan's original query should run.

          3) Decide whether we want to document the formal meaning.

          Thanks,
          -Rick

          Show
          rhillegas Rick Hillegas added a comment - I have left this issue open because I think the following work should still be done. We can close this issue and open a new issue for the follow-on work if people think that's better: 1) Formally describe the meaning of the special syntax which we still allow for VTIs but no longer allow for table functions (viz., the joins inside the FROM list). 2) Fix the optimizer so that query plans are picked which enforce the formal meaning. This is the crux of fixing this issue. Once we fix the optimizer, Bryan's original query should run. 3) Decide whether we want to document the formal meaning. Thanks, -Rick
          Hide
          rhillegas Rick Hillegas added a comment -

          Linking this issue to DERBY-2152. That is the issue which introduced parameterized diagnostic VTIs.

          Show
          rhillegas Rick Hillegas added a comment - Linking this issue to DERBY-2152 . That is the issue which introduced parameterized diagnostic VTIs.
          Hide
          rhillegas Rick Hillegas added a comment -

          Linking to DERBY-1520, the issue which documented parameterized diagnostic VTIs. DERBY-2152 does not discuss the meaning of joining parameterized VTIs in the FROM list. The first mention of this capability is on 2007-03-23 on DERBY-1520. Although DERBY-1520 mentions the capability (and the example made its way into the Reference Guide), there is no discussion of the meaning of these joins.

          Show
          rhillegas Rick Hillegas added a comment - Linking to DERBY-1520 , the issue which documented parameterized diagnostic VTIs. DERBY-2152 does not discuss the meaning of joining parameterized VTIs in the FROM list. The first mention of this capability is on 2007-03-23 on DERBY-1520 . Although DERBY-1520 mentions the capability (and the example made its way into the Reference Guide), there is no discussion of the meaning of these joins.
          Hide
          rhillegas Rick Hillegas added a comment -

          There does not seem to be any previous discussion of what it means to join a parameterized diagnostic VTI in the FROM list. DERBY-2152 was a follow-on issue to DERBY-571. It seems to me that the clear intent of those issues was to fit diagnostic VTIs inside the SQL Standard, replacing the Derby-specific constructor syntax. The intent was not to create more syntax outside the SQL Standard. It does not appear that anyone realized the possibility of the illegal join in the FROM list until late in the process of documenting DERBY-2152. It does not appear that anyone at that time was aware that the syntax had no meaning inside the SQL Standard. Certainly there is no discussion of how the syntax violates the Standard and there is no discussion of what the non-standard syntax means.

          This suggests the following:

          1) Without any guidance from the people who introduced this feature, we will have to try to come up with our own meaning.

          2) We should not document the meaning of the non-standard syntax. The point of DERBY-2152 was to steer people away from Derby extensions and encourage people to use standard syntax.

          Show
          rhillegas Rick Hillegas added a comment - There does not seem to be any previous discussion of what it means to join a parameterized diagnostic VTI in the FROM list. DERBY-2152 was a follow-on issue to DERBY-571 . It seems to me that the clear intent of those issues was to fit diagnostic VTIs inside the SQL Standard, replacing the Derby-specific constructor syntax. The intent was not to create more syntax outside the SQL Standard. It does not appear that anyone realized the possibility of the illegal join in the FROM list until late in the process of documenting DERBY-2152 . It does not appear that anyone at that time was aware that the syntax had no meaning inside the SQL Standard. Certainly there is no discussion of how the syntax violates the Standard and there is no discussion of what the non-standard syntax means. This suggests the following: 1) Without any guidance from the people who introduced this feature, we will have to try to come up with our own meaning. 2) We should not document the meaning of the non-standard syntax. The point of DERBY-2152 was to steer people away from Derby extensions and encourage people to use standard syntax.
          Hide
          rhillegas Rick Hillegas added a comment -

          In standard SQL, the order of elements in the FROM list does not affect whether the query compiles. This is only partly true for queries which use the FROM list to join VTIs. In particular, non-standard behavior arises when VTIs join with other VTIs. I have observed the following:

          1) The order of non-VTI and non-table-function elements is not significant.

          2) However, a VTI argument can only join to VTIs/table-functions which precede it in the FROM list.

          I don't imagine that anyone joins our argument-bearing VTIs with one another. Our internal metadata queries don't seem to do this. In addition, the diagnostic VTI columns are not meaningful arguments for other diagnostic VTIs. For this reason, I think that it should be ok to put in some bind-time logic which prevents users from writing queries which join diagnostic VTIs to one another in the FROM list. Removing this useless edge-case makes it easier to define the meaning of the remaining, useful joins.

          The following script demonstrates how you can (and can't) join VTIs to one another in the FROM list today:

          connect 'jdbc:derby:memory:db;create=true';

          select t1.*
          from
          sys.systables systabs,
          table ( syscs_diag.space_table( systabs.tablename ) ) as t1
          where systabs.tabletype = 'T';

          – compiles
          select t1., t2.
          from
          sys.systables systabs,
          table ( syscs_diag.space_table( systabs.tablename ) ) as t1,
          table ( syscs_diag.space_table( t1.conglomeratename ) ) as t2
          where systabs.tabletype = 'T';

          select t1.*
          from
          table ( syscs_diag.space_table( systabs.tablename ) ) as t1,
          sys.systables systabs
          where systabs.tabletype = 'T';

          – does not compile
          select t1., t2.
          from
          sys.systables systabs,
          table ( syscs_diag.space_table( t1.conglomeratename ) ) as t2,
          table ( syscs_diag.space_table( systabs.tablename ) ) as t1
          where systabs.tabletype = 'T';

          Show
          rhillegas Rick Hillegas added a comment - In standard SQL, the order of elements in the FROM list does not affect whether the query compiles. This is only partly true for queries which use the FROM list to join VTIs. In particular, non-standard behavior arises when VTIs join with other VTIs. I have observed the following: 1) The order of non-VTI and non-table-function elements is not significant. 2) However, a VTI argument can only join to VTIs/table-functions which precede it in the FROM list. I don't imagine that anyone joins our argument-bearing VTIs with one another. Our internal metadata queries don't seem to do this. In addition, the diagnostic VTI columns are not meaningful arguments for other diagnostic VTIs. For this reason, I think that it should be ok to put in some bind-time logic which prevents users from writing queries which join diagnostic VTIs to one another in the FROM list. Removing this useless edge-case makes it easier to define the meaning of the remaining, useful joins. The following script demonstrates how you can (and can't) join VTIs to one another in the FROM list today: connect 'jdbc:derby:memory:db;create=true'; select t1.* from sys.systables systabs, table ( syscs_diag.space_table( systabs.tablename ) ) as t1 where systabs.tabletype = 'T'; – compiles select t1. , t2. from sys.systables systabs, table ( syscs_diag.space_table( systabs.tablename ) ) as t1, table ( syscs_diag.space_table( t1.conglomeratename ) ) as t2 where systabs.tabletype = 'T'; select t1.* from table ( syscs_diag.space_table( systabs.tablename ) ) as t1, sys.systables systabs where systabs.tabletype = 'T'; – does not compile select t1. , t2. from sys.systables systabs, table ( syscs_diag.space_table( t1.conglomeratename ) ) as t2, table ( syscs_diag.space_table( systabs.tablename ) ) as t1 where systabs.tabletype = 'T';
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching derby-5554-03-aa-forbidVTItoVTIjoins.diff. This patch makes it illegal to join two VTIs to one another in the FROM list. Regression tests passed cleanly for me.

          Touches the following files:

          --------

          M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java

          Forbid the illegal join at bind() time.

          --------

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java

          Add a test case.

          Show
          rhillegas Rick Hillegas added a comment - Attaching derby-5554-03-aa-forbidVTItoVTIjoins.diff. This patch makes it illegal to join two VTIs to one another in the FROM list. Regression tests passed cleanly for me. Touches the following files: -------- M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java Forbid the illegal join at bind() time. -------- M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java Add a test case.
          Hide
          kmarsden Kathey Marsden added a comment -

          That seems like a reasonable restriction. It would be good to mention in a release note.

          Show
          kmarsden Kathey Marsden added a comment - That seems like a reasonable restriction. It would be good to mention in a release note.
          Hide
          dagw Dag H. Wanvik added a comment -

          Couldn't we just rearrange the order at bind time (top sort) ? So long as there is no circularity, it seems to me the ordering shouldn't be significant, SQL being set oriented and not imperative...
          (using a dependency relation expressing that a from list element references another in the way used here)

          Show
          dagw Dag H. Wanvik added a comment - Couldn't we just rearrange the order at bind time (top sort) ? So long as there is no circularity, it seems to me the ordering shouldn't be significant, SQL being set oriented and not imperative... (using a dependency relation expressing that a from list element references another in the way used here)
          Hide
          rhillegas Rick Hillegas added a comment -

          Thanks, Kathey. I have marked this issue as needing a release note.

          Thanks for that additional discussion, Dag. I don't see much value in building more internal support for a non-standard syntax whose use we want to discourage. The only use-case I can imagine for joining VTIs in the FROM list would be for future, internal queries involving new diagnostic VTIs. I think that a better approach to that use-case would be to design the new diagnostic VTIs in such a way that joins in the FROM list are not necessary. Thanks.

          Show
          rhillegas Rick Hillegas added a comment - Thanks, Kathey. I have marked this issue as needing a release note. Thanks for that additional discussion, Dag. I don't see much value in building more internal support for a non-standard syntax whose use we want to discourage. The only use-case I can imagine for joining VTIs in the FROM list would be for future, internal queries involving new diagnostic VTIs. I think that a better approach to that use-case would be to design the new diagnostic VTIs in such a way that joins in the FROM list are not necessary. Thanks.
          Hide
          rhillegas Rick Hillegas added a comment -

          Committed derby-5554-03-aa-forbidVTItoVTIjoins.diff at subversion revision 1357692.

          Show
          rhillegas Rick Hillegas added a comment - Committed derby-5554-03-aa-forbidVTItoVTIjoins.diff at subversion revision 1357692.
          Hide
          rhillegas Rick Hillegas added a comment -

          Before defining what it means to join VTIs in the FROM list, it's useful to describe how the clauses in a SQL Standard query are supposed to be evaluated. The following discussion is based on these sections in part 2 of the SQL Standard:

          o 7.13 <query expression> - This section describes a Query as a series of Selects, separated by UNION/INTERSECT operators, sorted by an ORDER BY clause, and bounded by FETCH FIRST and OFFSET clauses.

          o 7.12 <query specification> - This section describes a Select as a SELECT list on top of a <table expression>.

          o 7.4 <table expression> - This section describes the order of the remaining clauses: FROM, WHERE, GROUP BY, HAVING, WINDOW.

          1) Let Q be a query composed of a series of Selects, S1...Sn, separated by UNION/INTERSECT operators, sorted by an ORDER BY clause, and bounded by FETCH FIRST and OFFSET clauses.

          2) We evaluate each Select S as follows:

          F = the cartesian product of all tables, joined tables, views, subqueries, and table functions in the FROM clause.

          W = the result of applying the WHERE clause to F

          G = the result of applying the GROUP BY clause to W

          H = the result of applying the HAVING clause to G

          WW = the result of applying the WINDOW clause to H

          S = the result of applying the SELECT list expressions (including aggregates) to WW

          3) Now we compute

          U = the result of applying the UNION/INTERSECT operators to S1...Sn

          O = the result of applying the ORDER BY clause to U

          Q = the result of applying the FETCH FIRST and OFFSET clauses to O

          Show
          rhillegas Rick Hillegas added a comment - Before defining what it means to join VTIs in the FROM list, it's useful to describe how the clauses in a SQL Standard query are supposed to be evaluated. The following discussion is based on these sections in part 2 of the SQL Standard: o 7.13 <query expression> - This section describes a Query as a series of Selects, separated by UNION/INTERSECT operators, sorted by an ORDER BY clause, and bounded by FETCH FIRST and OFFSET clauses. o 7.12 <query specification> - This section describes a Select as a SELECT list on top of a <table expression>. o 7.4 <table expression> - This section describes the order of the remaining clauses: FROM, WHERE, GROUP BY, HAVING, WINDOW. 1) Let Q be a query composed of a series of Selects, S1...Sn, separated by UNION/INTERSECT operators, sorted by an ORDER BY clause, and bounded by FETCH FIRST and OFFSET clauses. 2) We evaluate each Select S as follows: F = the cartesian product of all tables, joined tables, views, subqueries, and table functions in the FROM clause. W = the result of applying the WHERE clause to F G = the result of applying the GROUP BY clause to W H = the result of applying the HAVING clause to G WW = the result of applying the WINDOW clause to H S = the result of applying the SELECT list expressions (including aggregates) to WW 3) Now we compute U = the result of applying the UNION/INTERSECT operators to S1...Sn O = the result of applying the ORDER BY clause to U Q = the result of applying the FETCH FIRST and OFFSET clauses to O
          Hide
          rhillegas Rick Hillegas added a comment -

          Now we can give meaning to Queries which join VTI parameters to other tables in the FROM list. This involves redefining the F, W, and S expressions described above.

          Note that VTI arguments can mention columns in tables, joined tables, views, and subqueries. However, VTI arguments may not mention columns in table functions or other VTIs.

          o Let V1...Vn be the VTIs which join to other elements in the FROM list.

          o Let C be the cartesian product of all other FROM list elements.

          o For each Vi, let Ki1...Kim be the columns in C which are mentioned by the arguments to Vi.

          o For each Vi, construct a temporary table Ti. The columns in Ti are all of the columns in Vi followed by additional columns Ki1...Kim. For each unique key combination (Ki1, ..., Kim) found in C, insert into Ti the results of evaluating Vi by plugging Ki1...Kim into its arguments.

          o For each Vij, add to the WHERE clause the conjunct (C.Kij = Ti.Kij). Call the resulting clause WHERE_2.

          o Finally, we construct these new values:

          F = the cartesian product of C with T1...Tn

          W = the result of applying WHERE_2 to F

          S = the result of applying the SELECT list to WW, throwing away all Ti.Kij columns.

          Show
          rhillegas Rick Hillegas added a comment - Now we can give meaning to Queries which join VTI parameters to other tables in the FROM list. This involves redefining the F, W, and S expressions described above. Note that VTI arguments can mention columns in tables, joined tables, views, and subqueries. However, VTI arguments may not mention columns in table functions or other VTIs. o Let V1...Vn be the VTIs which join to other elements in the FROM list. o Let C be the cartesian product of all other FROM list elements. o For each Vi, let Ki1...Kim be the columns in C which are mentioned by the arguments to Vi. o For each Vi, construct a temporary table Ti. The columns in Ti are all of the columns in Vi followed by additional columns Ki1...Kim. For each unique key combination (Ki1, ..., Kim) found in C, insert into Ti the results of evaluating Vi by plugging Ki1...Kim into its arguments. o For each Vij, add to the WHERE clause the conjunct (C.Kij = Ti.Kij). Call the resulting clause WHERE_2. o Finally, we construct these new values: F = the cartesian product of C with T1...Tn W = the result of applying WHERE_2 to F S = the result of applying the SELECT list to WW, throwing away all Ti.Kij columns.
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching the first rev of a release note for this issue.

          Show
          rhillegas Rick Hillegas added a comment - Attaching the first rev of a release note for this issue.
          Hide
          dagw Dag H. Wanvik added a comment -

          Thanks, Rick. My main concern was that the VT be placeable at any joinee position, your proposal handles that, so +1.

          Show
          dagw Dag H. Wanvik added a comment - Thanks, Rick. My main concern was that the VT be placeable at any joinee position, your proposal handles that, so +1.
          Hide
          bryanpendleton Bryan Pendleton added a comment -

          Rick, thanks for adding the release note.

          Do you think it would be useful to include the correct syntax for the (invalid) query given as an example in the release note?

          Something along the lines of:

          Applications which rely on these forbidden joins in the FROM list must be re-coded to use legal syntax. For example, the above query should be rewritten as: ...

          Show
          bryanpendleton Bryan Pendleton added a comment - Rick, thanks for adding the release note. Do you think it would be useful to include the correct syntax for the (invalid) query given as an example in the release note? Something along the lines of: Applications which rely on these forbidden joins in the FROM list must be re-coded to use legal syntax. For example, the above query should be rewritten as: ...
          Hide
          rhillegas Rick Hillegas added a comment -

          Thanks, Bryan. That is a good idea. Attaching a second rev of the release note. This presents a different bad query along with an example of how to re-write it so that it will compile.

          The new bad query joins a user-written table function to SPACE_TABLE in the FROM list. This is the only example I could imagine for a real query which might be affected by this change. Although joins to other SYSCS_DIAG tables and table functions are now forbidden, none of those joins really make any sense to begin with; it's hard to come up with substitute, re-written queries for queries which don't have any meaning!

          Thanks,
          -Rick

          Show
          rhillegas Rick Hillegas added a comment - Thanks, Bryan. That is a good idea. Attaching a second rev of the release note. This presents a different bad query along with an example of how to re-write it so that it will compile. The new bad query joins a user-written table function to SPACE_TABLE in the FROM list. This is the only example I could imagine for a real query which might be affected by this change. Although joins to other SYSCS_DIAG tables and table functions are now forbidden, none of those joins really make any sense to begin with; it's hard to come up with substitute, re-written queries for queries which don't have any meaning! Thanks, -Rick
          Hide
          bryanpendleton Bryan Pendleton added a comment -

          +1 to the revised release note.

          Show
          bryanpendleton Bryan Pendleton added a comment - +1 to the revised release note.
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching derby-5554-04-aa-forceVTIAfterJoin.diff. This is one approach to making Bryan's query work. Regression tests pass cleanly for me on this patch. I will also attach a second approach to fixing Bryan's query.

          It appears to me that this is what is going on:

          1) The problem query gives rise to a query plan with the following
          shape:

          SelectList
          /
          HashJoin
          / \
          NestedLoop sysconglomerates
          / \
          systables space_table

          2) At run-time, we are trying to initialize space_table from the NestedLoop row rather than from systables. The insertion of the extra HashJoin should have forced a renumbering of the result set number in the systables.tablename column reference. But it didn't.

          The derby-5554-04-aa-forceVTIAfterJoin.diff patch forces the VTI after all of the base tables in the join order. The VTI is then initialized from a row which has been filled in.

          I am not thrilled with this fix. It fiddles with the optimizer in order to work around a code-generation problem. However, I am attaching this approach as a fallback in case we find a fatal problem with a second approach which I will attach presently. The second approach is more complicated.

          Touches the following files:

          ------------

          M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java

          If a VTI argument refers to another element in the FROM list, then force the the VTI to follow all non-VTIs in the join order.

          ------------

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java

          Add Bryan's query as a test case.

          Show
          rhillegas Rick Hillegas added a comment - Attaching derby-5554-04-aa-forceVTIAfterJoin.diff. This is one approach to making Bryan's query work. Regression tests pass cleanly for me on this patch. I will also attach a second approach to fixing Bryan's query. It appears to me that this is what is going on: 1) The problem query gives rise to a query plan with the following shape: SelectList / HashJoin / \ NestedLoop sysconglomerates / \ systables space_table 2) At run-time, we are trying to initialize space_table from the NestedLoop row rather than from systables. The insertion of the extra HashJoin should have forced a renumbering of the result set number in the systables.tablename column reference. But it didn't. The derby-5554-04-aa-forceVTIAfterJoin.diff patch forces the VTI after all of the base tables in the join order. The VTI is then initialized from a row which has been filled in. I am not thrilled with this fix. It fiddles with the optimizer in order to work around a code-generation problem. However, I am attaching this approach as a fallback in case we find a fatal problem with a second approach which I will attach presently. The second approach is more complicated. Touches the following files: ------------ M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java If a VTI argument refers to another element in the FROM list, then force the the VTI to follow all non-VTIs in the join order. ------------ M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java Add Bryan's query as a test case.
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching derby-5554-04-ab-remapVTIarg.diff, a second approach to fixing Bryan's query. I am running regression tests now.

          The second approach code generates a reference to the result column in the SYSTABLES base table rather than to a column in a node above it. I am not thrilled by the complexity of this solution and I think that it may not be as general as the first approach of forcing the VTI to an inner slot in the join order. For instance, it may be possible to cook up elaborate queries which still fail because the VTI joins to the output of joined tables or subqueries in the FROM list.

          On balance, I think this is a better approach because it fixes a code generator problem in the code generator. I'm reluctant to spend more time coming up with a better fix since we want to discourage use of this kind of join.

          The fix is to have the base tables tell the VTI where their rows are. I could not find an easy way to dig up this information in the existing context available to the code generator. So I added a new method and piece of state to FromVTI for tracking this information.

          Touches the following files:

          -----------

          M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java
          M java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java

          Base tables give the location of their result rows to FromVTIs at code generation time.

          -----------

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java

          Adds Bryan's query as a test case.

          Show
          rhillegas Rick Hillegas added a comment - Attaching derby-5554-04-ab-remapVTIarg.diff, a second approach to fixing Bryan's query. I am running regression tests now. The second approach code generates a reference to the result column in the SYSTABLES base table rather than to a column in a node above it. I am not thrilled by the complexity of this solution and I think that it may not be as general as the first approach of forcing the VTI to an inner slot in the join order. For instance, it may be possible to cook up elaborate queries which still fail because the VTI joins to the output of joined tables or subqueries in the FROM list. On balance, I think this is a better approach because it fixes a code generator problem in the code generator. I'm reluctant to spend more time coming up with a better fix since we want to discourage use of this kind of join. The fix is to have the base tables tell the VTI where their rows are. I could not find an easy way to dig up this information in the existing context available to the code generator. So I added a new method and piece of state to FromVTI for tracking this information. Touches the following files: ----------- M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java M java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java Base tables give the location of their result rows to FromVTIs at code generation time. ----------- M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java Adds Bryan's query as a test case.
          Hide
          rhillegas Rick Hillegas added a comment -

          Tests passed cleanly for me on derby-5554-04-ab-remapVTIarg.diff.

          Show
          rhillegas Rick Hillegas added a comment - Tests passed cleanly for me on derby-5554-04-ab-remapVTIarg.diff.
          Hide
          knutanders Knut Anders Hatlen added a comment -

          If I understand the description of the 04-ab patch, it should not be dependent on the join order. That makes me wonder from which row in the base table it reads the value that it passes as an argument to the VTI, if the VTI happens to be the outer table in the join. Will the VTI be executed just once, with the value taken from the first row in the base table? I tried to test it by forcing the join order with an optimizer override, but then I just got "ERROR 42Y70: The user specified an illegal join order. This could be caused by a join column from an inner table being passed as a parameter to an external virtual table." So it may be that the optimizer knows not to use the VTI as the outer table, but it's not entirely obvious to me.

          I tried a variant of Bryan's original query, just with a different ordering of the FROM list, and I'm afraid I still see a NullPointerException with the 04-ab patch:

          ij> create table t1(x int);
          0 rows inserted/updated/deleted
          ij> create table t2(y int);
          0 rows inserted/updated/deleted
          ij> SELECT T2., systabs., syscgs.conglomeratenumber
          FROM
          sys.sysconglomerates syscgs,
          TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2,
          SYS.SYSTABLES systabs
          WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid;
          ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression. (errorCode = 20000)
          ERROR XJ001: Java exception: ': java.lang.NullPointerException'. (errorCode = 20000)

          Show
          knutanders Knut Anders Hatlen added a comment - If I understand the description of the 04-ab patch, it should not be dependent on the join order. That makes me wonder from which row in the base table it reads the value that it passes as an argument to the VTI, if the VTI happens to be the outer table in the join. Will the VTI be executed just once, with the value taken from the first row in the base table? I tried to test it by forcing the join order with an optimizer override, but then I just got "ERROR 42Y70: The user specified an illegal join order. This could be caused by a join column from an inner table being passed as a parameter to an external virtual table." So it may be that the optimizer knows not to use the VTI as the outer table, but it's not entirely obvious to me. I tried a variant of Bryan's original query, just with a different ordering of the FROM list, and I'm afraid I still see a NullPointerException with the 04-ab patch: ij> create table t1(x int); 0 rows inserted/updated/deleted ij> create table t2(y int); 0 rows inserted/updated/deleted ij> SELECT T2. , systabs. , syscgs.conglomeratenumber FROM sys.sysconglomerates syscgs, TABLE (SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2, SYS.SYSTABLES systabs WHERE systabs.tabletype = 'T' and systabs.tableid = syscgs.tableid; ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression. (errorCode = 20000) ERROR XJ001: Java exception: ': java.lang.NullPointerException'. (errorCode = 20000)
          Hide
          rhillegas Rick Hillegas added a comment -

          Thanks for test-driving the patch, Knut. The same query also raises an NPE with the first patch.

          Show
          rhillegas Rick Hillegas added a comment - Thanks for test-driving the patch, Knut. The same query also raises an NPE with the first patch.
          Hide
          rhillegas Rick Hillegas added a comment -

          Attaching derby-5554-04-ac-remapVTIarg.diff, a new rev of the argument-remapping approach. I am running regression tests now.

          This rev is simpler than the previous attempt. It may also handle more cases than the previous patch. The patch merely touches FromVTI.

          Touches the following files:

          -----------

          M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java

          1) At bind-time, we save a reference to every FROM list element referenced by the VTI's arguments.

          2) At code-generation time, we build an argument expression from the ResultColumn actually used when the FROM list element was code generated.

          -----------

          M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java

          Adds Bryan's original query and Knut's re-ordered query as test cases.

          Show
          rhillegas Rick Hillegas added a comment - Attaching derby-5554-04-ac-remapVTIarg.diff, a new rev of the argument-remapping approach. I am running regression tests now. This rev is simpler than the previous attempt. It may also handle more cases than the previous patch. The patch merely touches FromVTI. Touches the following files: ----------- M java/engine/org/apache/derby/impl/sql/compile/FromVTI.java 1) At bind-time, we save a reference to every FROM list element referenced by the VTI's arguments. 2) At code-generation time, we build an argument expression from the ResultColumn actually used when the FROM list element was code generated. ----------- M java/testing/org/apache/derbyTesting/functionTests/tests/lang/SysDiagVTIMappingTest.java Adds Bryan's original query and Knut's re-ordered query as test cases.
          Hide
          knutanders Knut Anders Hatlen added a comment -

          Thanks, Rick. That seems to fix all possible orderings of the from list in that query. However, I still see a NPE if I use the JOIN keyword:

          ij> connect 'jdbc:derby:memory:db;create=true';
          ij> create table t1(x int);
          0 rows inserted/updated/deleted
          ij> create table t2(y int);
          0 rows inserted/updated/deleted
          ij> select tt.* from table(syscs_diag.space_table(st.tablename)) tt join sys.systables st using(tableid);
          ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
          ERROR XJ001: Java exception: ': java.lang.NullPointerException'.

          Perhaps related, if I change the join to an outer join (doesn't matter if it's right or left join), it fails with an assert failure (or NPE with insane jars).

          ij> select tt.* from table(syscs_diag.space_table(st.tablename)) tt right join sys.systables st using(tableid);
          ERROR XJ001: Java exception: 'ASSERT FAILED costEstimate is not expected to be null for org.apache.derby.impl.sql.compile.FromVTI: org.apache.derby.shared.common.sanity.AssertFailure'.

          I do see these errors also without your patch.

          Show
          knutanders Knut Anders Hatlen added a comment - Thanks, Rick. That seems to fix all possible orderings of the from list in that query. However, I still see a NPE if I use the JOIN keyword: ij> connect 'jdbc:derby:memory:db;create=true'; ij> create table t1(x int); 0 rows inserted/updated/deleted ij> create table t2(y int); 0 rows inserted/updated/deleted ij> select tt.* from table(syscs_diag.space_table(st.tablename)) tt join sys.systables st using(tableid); ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression. ERROR XJ001: Java exception: ': java.lang.NullPointerException'. Perhaps related, if I change the join to an outer join (doesn't matter if it's right or left join), it fails with an assert failure (or NPE with insane jars). ij> select tt.* from table(syscs_diag.space_table(st.tablename)) tt right join sys.systables st using(tableid); ERROR XJ001: Java exception: 'ASSERT FAILED costEstimate is not expected to be null for org.apache.derby.impl.sql.compile.FromVTI: org.apache.derby.shared.common.sanity.AssertFailure'. I do see these errors also without your patch.
          Hide
          rhillegas Rick Hillegas added a comment -

          Thanks for test-driving the derby-5554-04-ac-remapVTIarg.diff patch, Knut. My instinct is to limit support for this non-standard syntax to the cases documented in the user guides or used by our internal metadata queries. The additional cases you have discovered are probably not being stressed in the wild--at least no-one has filed a bug report for them. We could probably extrapolate a meaning for these additional queries, based on the meaning which we have given to VTI-joins in the FROM list, but I am reluctant to extend the applicability of this Derby extension unless we have to.

          What do you think about this plan:

          1) Commit derby-5554-04-ac-remapVTIarg.diff once we're convinced that it works for our metadata queries and for the joins documented in our user guides.

          2) Prepare another patch to raise an error when compiling the additional query families you have discovered.

          Thanks,
          -Rick

          Show
          rhillegas Rick Hillegas added a comment - Thanks for test-driving the derby-5554-04-ac-remapVTIarg.diff patch, Knut. My instinct is to limit support for this non-standard syntax to the cases documented in the user guides or used by our internal metadata queries. The additional cases you have discovered are probably not being stressed in the wild--at least no-one has filed a bug report for them. We could probably extrapolate a meaning for these additional queries, based on the meaning which we have given to VTI-joins in the FROM list, but I am reluctant to extend the applicability of this Derby extension unless we have to. What do you think about this plan: 1) Commit derby-5554-04-ac-remapVTIarg.diff once we're convinced that it works for our metadata queries and for the joins documented in our user guides. 2) Prepare another patch to raise an error when compiling the additional query families you have discovered. Thanks, -Rick
          Hide
          rhillegas Rick Hillegas added a comment -

          In any event, I think that we need to file a new JIRA to forbid the use of table functions in the additional query families which Knut identified. I see the same errors when I use table functions instead of diagnostic VTIs. The following script shows this:

          connect 'jdbc:derby:memory:db;create=true';

          create function lowerCaseRow( contents varchar( 32672 ) )
          returns table
          (
          contents varchar( 32672 )
          )
          language java parameter style DERBY_JDBC_RESULT_SET no sql
          external name 'org.apache.derbyTesting.functionTests.tests.lang.TableFunctionTest.lowerCaseRow';

          select tt.* from table( lowerCaseRow(st.tablename)) tt join sys.systables st on tt.contents = st.tablename;

          select tt.* from table( lowerCaseRow(st.tablename)) tt right join sys.systables st on tt.contents = st.tablename;

          Show
          rhillegas Rick Hillegas added a comment - In any event, I think that we need to file a new JIRA to forbid the use of table functions in the additional query families which Knut identified. I see the same errors when I use table functions instead of diagnostic VTIs. The following script shows this: connect 'jdbc:derby:memory:db;create=true'; create function lowerCaseRow( contents varchar( 32672 ) ) returns table ( contents varchar( 32672 ) ) language java parameter style DERBY_JDBC_RESULT_SET no sql external name 'org.apache.derbyTesting.functionTests.tests.lang.TableFunctionTest.lowerCaseRow'; select tt.* from table( lowerCaseRow(st.tablename)) tt join sys.systables st on tt.contents = st.tablename; select tt.* from table( lowerCaseRow(st.tablename)) tt right join sys.systables st on tt.contents = st.tablename;
          Hide
          kmarsden Kathey Marsden added a comment -

          For table functions, should those queries be rejected according to the standard?

          Show
          kmarsden Kathey Marsden added a comment - For table functions, should those queries be rejected according to the standard?
          Hide
          rhillegas Rick Hillegas added a comment -

          Hi Kathey,

          Yes, these queries are another violation of part 2 of the SQL Standard, section 7.6 (<table reference>), Syntax Rule 6.a. I have re-opened DERBY-5779 to continue the work of rejecting these violations. Thanks.

          Show
          rhillegas Rick Hillegas added a comment - Hi Kathey, Yes, these queries are another violation of part 2 of the SQL Standard, section 7.6 (<table reference>), Syntax Rule 6.a. I have re-opened DERBY-5779 to continue the work of rejecting these violations. Thanks.
          Hide
          rhillegas Rick Hillegas added a comment -

          Tests passed cleanly for me on derby-5554-04-ac-remapVTIarg.diff. As an incremental improvement, I have committed the patch at subversion revision 1360306.

          Show
          rhillegas Rick Hillegas added a comment - Tests passed cleanly for me on derby-5554-04-ac-remapVTIarg.diff. As an incremental improvement, I have committed the patch at subversion revision 1360306.
          Hide
          rhillegas Rick Hillegas added a comment -

          Resolving this issue because the work on DERBY-5779 prevents the additional bad syntax. We can re-open this issue if we discover other bad syntax.

          Show
          rhillegas Rick Hillegas added a comment - Resolving this issue because the work on DERBY-5779 prevents the additional bad syntax. We can re-open this issue if we discover other bad syntax.
          Hide
          kmarsden Kathey Marsden added a comment -

          Marking derby_backport_reject_10_8 as this issue required a release note and restricted SQL syntax. Doesn't look appropriate for backport.

          Show
          kmarsden Kathey Marsden added a comment - Marking derby_backport_reject_10_8 as this issue required a release note and restricted SQL syntax. Doesn't look appropriate for backport.

            People

            • Assignee:
              rhillegas Rick Hillegas
              Reporter:
              bryanpendleton Bryan Pendleton
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development