Derby
  1. Derby
  2. DERBY-5954

NPE in SELECT involving subselects and windows functions

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.6.1.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.8.2.2, 10.9.1.0
    • Fix Version/s: 10.8.3.3, 10.9.2.2, 10.10.1.1
    • Component/s: SQL
    • Labels:
      None
    • Urgency:
      Normal
    • Issue & fix info:
      Repro attached
    • Bug behavior facts:
      Regression, Seen in production

      Description

      A user reports and I have verified an NPE on the following SELECT:

      connect 'jdbc:derby:memory:db;create=true';

      create table blah ( a int );
      insert into blah values (1), (2), (3), (4), (5), (6), (7);

      SELECT rn, (SELECT rn FROM (SELECT row_number() over() rn FROM blah ) as T2
      where T2.rn = T1.rn+1) rn2
      FROM (SELECT row_number() over() rn from blah) as T1;

      1. derby-5954.diff
        0.6 kB
        Dag H. Wanvik
      2. derby-5954-with-test.stat
        0.2 kB
        Dag H. Wanvik
      3. derby-5954-with-test.diff
        2 kB
        Dag H. Wanvik
      4. derby-5954-with-test-2.diff
        2 kB
        Dag H. Wanvik
      5. derby-5954-with-test-2.stat
        0.2 kB
        Dag H. Wanvik

        Issue Links

          Activity

          Hide
          Rick Hillegas added a comment -

          The user reports:

          "NOTE:

          I have tested this with Derby 10.5, 10.8 and 10.9

          The queries work with Derby 10.5 ; but fail with 10.8 and 10.9"

          Show
          Rick Hillegas added a comment - The user reports: "NOTE: I have tested this with Derby 10.5, 10.8 and 10.9 The queries work with Derby 10.5 ; but fail with 10.8 and 10.9"
          Show
          Myrna van Lunteren added a comment - This was on the following thread: http://old.nabble.com/Nested-SELECT-in-selected-columns-list-causes-NullPointerException-if-it-references-ROW_NUMBER%28%29-OVER%28%29-function-td34567925.html
          Hide
          Dag H. Wanvik added a comment - - edited

          This query with only one row_number causes an assert with the debug build:

          SELECT rn_t1, (
          SELECT rn_t2 FROM (
          SELECT row_number() over() as rn_t2 FROM derby5954)
          as T2
          where T2.rn_t2 = T1.rn_t1)
          as rn_outer
          FROM (SELECT a as rn_t1 from derby5954) as T1")

          Caused by: org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED sourceResultSetNumber expected to be >= 0 for virtual column RN_T2
          at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
          at org.apache.derby.impl.sql.compile.VirtualColumnNode.generateExpression(VirtualColumnNode.java:231)
          at org.apache.derby.impl.sql.compile.ResultColumn.generateExpression(ResultColumn.java:1059)
          at org.apache.derby.impl.sql.compile.ResultColumnList.generateCore(ResultColumnList.java:1426)
          at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generateMinion(ProjectRestrictNode.java:1557)
          at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generate(ProjectRestrictNode.java:1337)
          at org.apache.derby.impl.sql.compile.WindowResultSetNode.generate(WindowResultSetNode.java:409)
          at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generateMinion(ProjectRestrictNode.java:1482)
          at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generate(ProjectRestrictNode.java:1337)
          at org.apache.derby.impl.sql.compile.ScrollInsensitiveResultSetNode.generate(ScrollInsensitiveResultSetNode.java:109)
          at org.apache.derby.impl.sql.compile.CursorNode.generate(CursorNode.java:641)

          Interestingly, a similar query without row_number, gives a wrong result; i think:

          SELECT a_t1, (
          SELECT a_t2 FROM (
          SELECT a as a_t2 FROM derby5954)
          as T2
          where T2.a_t2 = T1.a_t1 + 1 )
          as a_outer
          FROM (SELECT a as a_t1 from derby5954) as T1

          gives:

          1 2
          2 3
          3 4
          4 5
          5 6
          6 7
          7 0

          In the last row, for the value of a=7, I think the scalar subquery in the select list should give "null", not 0.

          Update: The last above is misleading pilot error, the 0 comes from ResultSet#getInt, which will return 0 if the result column is NULL. So no error there.

          Show
          Dag H. Wanvik added a comment - - edited This query with only one row_number causes an assert with the debug build: SELECT rn_t1, ( SELECT rn_t2 FROM ( SELECT row_number() over() as rn_t2 FROM derby5954) as T2 where T2.rn_t2 = T1.rn_t1) as rn_outer FROM (SELECT a as rn_t1 from derby5954) as T1") Caused by: org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED sourceResultSetNumber expected to be >= 0 for virtual column RN_T2 at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120) at org.apache.derby.impl.sql.compile.VirtualColumnNode.generateExpression(VirtualColumnNode.java:231) at org.apache.derby.impl.sql.compile.ResultColumn.generateExpression(ResultColumn.java:1059) at org.apache.derby.impl.sql.compile.ResultColumnList.generateCore(ResultColumnList.java:1426) at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generateMinion(ProjectRestrictNode.java:1557) at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generate(ProjectRestrictNode.java:1337) at org.apache.derby.impl.sql.compile.WindowResultSetNode.generate(WindowResultSetNode.java:409) at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generateMinion(ProjectRestrictNode.java:1482) at org.apache.derby.impl.sql.compile.ProjectRestrictNode.generate(ProjectRestrictNode.java:1337) at org.apache.derby.impl.sql.compile.ScrollInsensitiveResultSetNode.generate(ScrollInsensitiveResultSetNode.java:109) at org.apache.derby.impl.sql.compile.CursorNode.generate(CursorNode.java:641) Interestingly, a similar query without row_number, gives a wrong result; i think: SELECT a_t1, ( SELECT a_t2 FROM ( SELECT a as a_t2 FROM derby5954) as T2 where T2.a_t2 = T1.a_t1 + 1 ) as a_outer FROM (SELECT a as a_t1 from derby5954) as T1 gives: 1 2 2 3 3 4 4 5 5 6 6 7 7 0 In the last row, for the value of a=7, I think the scalar subquery in the select list should give "null", not 0. Update: The last above is misleading pilot error, the 0 comes from ResultSet#getInt, which will return 0 if the result column is NULL. So no error there.
          Hide
          Dag H. Wanvik added a comment - - edited

          Uploading patch derby-5954, which is a simple one-line patch that makes the row_number variants work, i.e.
          it will work as "well" as the table select with similar nesting/correlated query above.

          The issue was the following: when looking for in-line window definitions in the select list, the visitor looked to deep: it should not move into subqueries in the select list. This error caused the query tree to become seriously garbled, hence the NPE and the assert seen.

          Still need to add a JUnit test, so not for commit yet.

          Show
          Dag H. Wanvik added a comment - - edited Uploading patch derby-5954, which is a simple one-line patch that makes the row_number variants work, i.e. it will work as "well" as the table select with similar nesting/correlated query above. The issue was the following: when looking for in-line window definitions in the select list, the visitor looked to deep: it should not move into subqueries in the select list. This error caused the query tree to become seriously garbled, hence the NPE and the assert seen. Still need to add a JUnit test, so not for commit yet.
          Hide
          Dag H. Wanvik added a comment -

          Uploading the patch extended with a test case addition to OLAPTest.
          Running regression, please review.

          Show
          Dag H. Wanvik added a comment - Uploading the patch extended with a test case addition to OLAPTest. Running regression, please review.
          Hide
          Dag H. Wanvik added a comment -

          Regressions ran ok.

          Show
          Dag H. Wanvik added a comment - Regressions ran ok.
          Hide
          Bryan Pendleton added a comment -

          Thanks for working on this, Dag!

          The patch looks clear and straightforward.

          Just thinking abstractly, it seems like there are two different ways that the new
          visitor code could be wrong:
          1) It could fail to descend into a sub-query when in fact it should, or
          2) It could descend into a sub-query that it shouldn't descend into.

          Trying to think more concretely, I wonder about the literal use of SelectNode
          as the argument to the Visitor instance.

          What if, for example, there is a UNION in there; for example, what if we saw:

          SELECT rn_t1, (
          SELECT rn_t2 FROM (
          SELECT row_number() over() as rn_t2 FROM derby5954
          UNION
          SELECT somecolumn FROM sometable)
          as T2
          where T2.rn_t2 = T1.rn_t1)
          as rn_outer
          FROM (SELECT a as rn_t1 from derby5954) as T1")

          I'm not even sure that what I'm typing is syntactically valid; I'm just trying
          to inspire some other test cases that might be revealing.

          Show
          Bryan Pendleton added a comment - Thanks for working on this, Dag! The patch looks clear and straightforward. Just thinking abstractly, it seems like there are two different ways that the new visitor code could be wrong: 1) It could fail to descend into a sub-query when in fact it should, or 2) It could descend into a sub-query that it shouldn't descend into. Trying to think more concretely, I wonder about the literal use of SelectNode as the argument to the Visitor instance. What if, for example, there is a UNION in there; for example, what if we saw: SELECT rn_t1, ( SELECT rn_t2 FROM ( SELECT row_number() over() as rn_t2 FROM derby5954 UNION SELECT somecolumn FROM sometable) as T2 where T2.rn_t2 = T1.rn_t1) as rn_outer FROM (SELECT a as rn_t1 from derby5954) as T1") I'm not even sure that what I'm typing is syntactically valid; I'm just trying to inspire some other test cases that might be revealing.
          Hide
          Bryan Pendleton added a comment -

          To be clear: I have no objections to moving forward with the proposed patch

          We could continue to try to add other ROW_NUMBER tests and window tests
          as part of separate efforts in separate JIRA issues.

          Show
          Bryan Pendleton added a comment - To be clear: I have no objections to moving forward with the proposed patch We could continue to try to add other ROW_NUMBER tests and window tests as part of separate efforts in separate JIRA issues.
          Hide
          Kathey Marsden added a comment -

          I would really like this fix in 10.8.3 and think it would be worthwhile to hold up the release up to another week as this is a regression and a fix needed by a user in 10.8 but Bryan's comments seem serious:

          "Just thinking abstractly, it seems like there are two different ways that the new
          visitor code could be wrong:
          1) It could fail to descend into a sub-query when in fact it should, or
          2) It could descend into a sub-query that it shouldn't descend into."

          Bryan, do you think the patch might potentially cause a different regression? Dag, what are your thoughts on this?

          Thanks

          Kathey

          Show
          Kathey Marsden added a comment - I would really like this fix in 10.8.3 and think it would be worthwhile to hold up the release up to another week as this is a regression and a fix needed by a user in 10.8 but Bryan's comments seem serious: "Just thinking abstractly, it seems like there are two different ways that the new visitor code could be wrong: 1) It could fail to descend into a sub-query when in fact it should, or 2) It could descend into a sub-query that it shouldn't descend into." Bryan, do you think the patch might potentially cause a different regression? Dag, what are your thoughts on this? Thanks Kathey
          Hide
          Dag H. Wanvik added a comment - - edited

          I'll try to explain why I believe the changed code is correct. Sorry for answering late, I have been out for a while.
          What we are looking for here is any "<in-line window specification>" in the sense of SQL 2003 section 6.10 "Window Function" contained in a select list (syntax rule 3b). The meaning of the in-line window specification is described in Syntax Rule 7-9 of section 6.10.
          Pertinent quote (simplified):

          7) Let SL be the <select list> that simply contains[1] OF (the window function)
          8d) Let SLNEW be the <select list> that is obtained from SL by replacing OF by:
          OFT OVER WSN (where WSN is an equivalent synthesized window on the table expression TE)
          Rule 9 f) and g) describes how our in-lined window is transformed to be appended to any existing windows for the table expression.

          Now, earlier, we found window functions not simply contained in the SL, i.e. inside a nested SELECT since the visitor went to deep. The in-lined window specification properly belonged at the level of that nested SELECT's table expression.

          So, Bryan's Criterion 1) is satisfied, in that the visitor previously visited all nodes contained in the select list.
          After the change we do not visit nodes under a simply contained (nested) SELECT statement. But that is OK, since any OF previously (erroneously) found there are not *simply contained in the outer SL.

          So 1) holds I believe. Now, as for 2) could we (still) be descending into any subquery we should not (we obviously did before)?
          That would need to be in a case where we had some legal syntax in the select list (SL) containing in-lined window specification not inside a SELECT node. Can that happen? Well, besides SELECT we can have a VALUES subquery in a SELECT list. But we forbid window functions inside a VALUES subquery, cf. this line in RowResultSetNode#bindExpressions:

          SelectNode.checkNoWindowFunctions(resultColumns, "VALUES");

          So, since we now forbid the visitor to descend into nested SELECTs and any nested VALUES are innocuous, we can not find any "wrong" in-lined window function. So 2) also holds.

          [1] A1 simply contains B1 if A1 contains B1 without an intervening instance of <A> or an intervening instance of
          <B>.

          Makes sense?

          Show
          Dag H. Wanvik added a comment - - edited I'll try to explain why I believe the changed code is correct. Sorry for answering late, I have been out for a while. What we are looking for here is any "<in-line window specification>" in the sense of SQL 2003 section 6.10 "Window Function" contained in a select list (syntax rule 3b). The meaning of the in-line window specification is described in Syntax Rule 7-9 of section 6.10. Pertinent quote (simplified): 7) Let SL be the <select list> that simply contains [1] OF (the window function) 8d) Let SLNEW be the <select list> that is obtained from SL by replacing OF by: OFT OVER WSN (where WSN is an equivalent synthesized window on the table expression TE) Rule 9 f) and g) describes how our in-lined window is transformed to be appended to any existing windows for the table expression. Now, earlier, we found window functions not simply contained in the SL, i.e. inside a nested SELECT since the visitor went to deep. The in-lined window specification properly belonged at the level of that nested SELECT's table expression. So, Bryan's Criterion 1) is satisfied, in that the visitor previously visited all nodes contained in the select list. After the change we do not visit nodes under a simply contained (nested) SELECT statement. But that is OK, since any OF previously (erroneously) found there are not *simply contained in the outer SL. So 1) holds I believe. Now, as for 2) could we (still) be descending into any subquery we should not (we obviously did before)? That would need to be in a case where we had some legal syntax in the select list (SL) containing in-lined window specification not inside a SELECT node. Can that happen? Well, besides SELECT we can have a VALUES subquery in a SELECT list. But we forbid window functions inside a VALUES subquery, cf. this line in RowResultSetNode#bindExpressions: SelectNode.checkNoWindowFunctions(resultColumns, "VALUES"); So, since we now forbid the visitor to descend into nested SELECTs and any nested VALUES are innocuous, we can not find any "wrong" in-lined window function. So 2) also holds. [1] A1 simply contains B1 if A1 contains B1 without an intervening instance of <A> or an intervening instance of <B>. Makes sense?
          Hide
          Dag H. Wanvik added a comment -

          Uploading "derby-5954-with-test-2", which improves on the comment next to the change.

          Show
          Dag H. Wanvik added a comment - Uploading "derby-5954-with-test-2", which improves on the comment next to the change.
          Hide
          Dag H. Wanvik added a comment -

          Committed patch "derby-5954-with-test-2" as svn 1406240, resolving. I plan to backport this change so please do not close this issue yet.

          Show
          Dag H. Wanvik added a comment - Committed patch "derby-5954-with-test-2" as svn 1406240, resolving. I plan to backport this change so please do not close this issue yet.
          Hide
          Dag H. Wanvik added a comment -

          Backported to 10.9 as svn 1406247 and to 10.8 as svn 1406254. It should be possible to backport this all the way to 10.6, but I stop at 10.8 now. Feel free to close, Rick.

          Show
          Dag H. Wanvik added a comment - Backported to 10.9 as svn 1406247 and to 10.8 as svn 1406254. It should be possible to backport this all the way to 10.6, but I stop at 10.8 now. Feel free to close, Rick.
          Hide
          Rick Hillegas added a comment -

          Closing. Thanks, Dag.

          Show
          Rick Hillegas added a comment - Closing. Thanks, Dag.
          Hide
          Dag H. Wanvik added a comment -

          Opps, forgot to set all Fixed in items, reopening to do that

          Show
          Dag H. Wanvik added a comment - Opps, forgot to set all Fixed in items, reopening to do that
          Hide
          Bryan Pendleton added a comment -

          Dag, thanks for working through the reasoning with us. It helps me to understand your
          thinking, and describing it as you did will also help future students of the code who
          are trying to understand its behavior. It is much appreciated!

          Show
          Bryan Pendleton added a comment - Dag, thanks for working through the reasoning with us. It helps me to understand your thinking, and describing it as you did will also help future students of the code who are trying to understand its behavior. It is much appreciated!

            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Rick Hillegas
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development