Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.5.1.1, 10.5.2.0, 10.5.3.0, 10.6.1.0
    • Fix Version/s: 10.5.3.2, 10.6.2.1, 10.7.1.1
    • Component/s: SQL
    • Labels:
      None
    • Bug behavior facts:
      Regression

      Description

      Using the schema from DERBY-4712 and running the original randomized query generator used to find DERBY-4712, http://code.google.com/p/h2database/source/browse/trunk/h2/src/test/org/h2/test/db/TestNestedJoins.java, I have uncovered yet another bug (NPE), which appears to be a different beast. This one is a regression in 10.5 (works in 10.4 and older). This is the failing query:

      SELECT t0.x0,
      t1.x1,
      t2.x2,
      t3.x3,
      t4.x4,
      t5.x5,
      t6.x6,
      t7.x7
      FROM ((t0
      LEFT OUTER JOIN ((t1
      LEFT OUTER JOIN (t2
      LEFT OUTER JOIN t3
      ON t2.x2 = t3.x3 )
      ON t1.x1 = t2.x2 )
      LEFT OUTER JOIN (t4
      INNER JOIN (t5
      LEFT OUTER JOIN t6
      ON t5.x5 = t6.x6 )
      ON t4.x4 = t5.x5 )
      ON t1.x1 = t5.x5 )
      ON t0.x0 = t5.x5 )
      LEFT OUTER JOIN t7
      ON t3.x3 = t7.x7 );

      Relevant part of the stack trace (using 10.5 trunk @ svn 995846):

      Caused by: java.lang.NullPointerException
      at org.apache.derby.impl.sql.execute.BaseActivation.getColumnFromRow(BaseActivation.java:1458)
      at org.apache.derby.exe.ac4ac48095x012axfc73x9c5dx000003d485d847.e19(Unknown Source)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.derby.impl.services.reflect.ReflectMethod.invoke(ReflectMethod.java:46)
      at org.apache.derby.impl.sql.execute.GenericQualifier.getOrderable(GenericQualifier.java:96)
      at org.apache.derby.impl.sql.execute.NoPutResultSetImpl.clearOrderableCache(NoPutResultSetImpl.java:307)
      at org.apache.derby.impl.sql.execute.HashScanResultSet.resetProbeVariables(HashScanResultSet.java:359)
      at org.apache.derby.impl.sql.execute.HashScanResultSet.openCore(HashScanResultSet.java:322)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openRight(JoinResultSet.java:283)
      at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(JoinResultSet.java:152)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(ProjectRestrictResultSet.java:181)
      at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(BasicNoPutResultSetImpl.java:251)
      at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:416)

      1. DERBY-4798_10_5_diff.txt
        8 kB
        Kathey Marsden
      2. derby-4798a.stat
        0.2 kB
        Dag H. Wanvik
      3. derby-4798a.diff
        8 kB
        Dag H. Wanvik
      4. derby-4798-generated-code.txt
        44 kB
        Dag H. Wanvik
      5. derby-4798-optimized-tree.txt
        167 kB
        Dag H. Wanvik
      6. derby4798-whence-rs-5-and-execution-plan.txt
        19 kB
        Dag H. Wanvik
      7. OuterJoinTest.diff
        7 kB
        Dag H. Wanvik
      8. experiment.diff
        1 kB
        Bryan Pendleton

        Issue Links

          Activity

          Hide
          Dag H. Wanvik added a comment -

          Linking to DERBY-4712, since discovered there and relying on that schema.

          Show
          Dag H. Wanvik added a comment - Linking to DERBY-4712 , since discovered there and relying on that schema.
          Hide
          Knut Anders Hatlen added a comment -

          The NPE in this issue first occurred after this commit (might not be the root cause, though, since the comments in that issue indicate that the original code tended to hide problems elsewhere):

          ------------------------------------------------------------------------
          r662947 | bpendleton | 2008-06-04 04:44:17 +0200 (Wed, 04 Jun 2008) | 4 lines

          DERBY-3097: Remove unnecessary if stmt from BaseActivation.getColumnFromRow

          This change removes an unnecessary if statement from BaseActivation.java.

          ------------------------------------------------------------------------

          Show
          Knut Anders Hatlen added a comment - The NPE in this issue first occurred after this commit (might not be the root cause, though, since the comments in that issue indicate that the original code tended to hide problems elsewhere): ------------------------------------------------------------------------ r662947 | bpendleton | 2008-06-04 04:44:17 +0200 (Wed, 04 Jun 2008) | 4 lines DERBY-3097 : Remove unnecessary if stmt from BaseActivation.getColumnFromRow This change removes an unnecessary if statement from BaseActivation.java. ------------------------------------------------------------------------
          Hide
          Dag H. Wanvik added a comment -

          Yes, I just found this too, Knut! Seems we were both chasing it down, lol. There were some comments in the removed code:

          /* This actually happens. NoPutResultSetImpl.clearOrderableCache attempts to prefetch invariant values

          • into a cache. This fails in some deeply nested joins. See Beetle 4736 and 4880.
            */

          So, it seems this is still the case.

          Show
          Dag H. Wanvik added a comment - Yes, I just found this too, Knut! Seems we were both chasing it down, lol. There were some comments in the removed code: /* This actually happens. NoPutResultSetImpl.clearOrderableCache attempts to prefetch invariant values into a cache. This fails in some deeply nested joins. See Beetle 4736 and 4880. */ So, it seems this is still the case.
          Hide
          Dag H. Wanvik added a comment -

          Feel free to pick it up

          Show
          Dag H. Wanvik added a comment - Feel free to pick it up
          Hide
          Bryan Pendleton added a comment -

          Following along the lines of DERBY-3097, perhaps there is some way that
          NoPutResultSetImpl.clearOrderableCache() can check (numOpens == 0)
          to be aware that the result set has not yet been opened, and can then
          initialize the orderable cache in some other fashion (e.g., by calling
          DataValueDescriptor.getNewNull() on the various Qualifier items in the cache?

          I'm just throwing out ideas here, don't even know if they make any sense.

          Alternatively, since this appears to have to do with invariant qualifiers, maybe
          the code can check the qualifiers in the cache to see if they are variant or
          invariant, and can make the getOrderable() call only for invariant qualifiers.
          (Is it safe to call Qualifier.getOrderable for an invariant qualifier even if the
          containing result set has not been opened yet?)

          Show
          Bryan Pendleton added a comment - Following along the lines of DERBY-3097 , perhaps there is some way that NoPutResultSetImpl.clearOrderableCache() can check (numOpens == 0) to be aware that the result set has not yet been opened, and can then initialize the orderable cache in some other fashion (e.g., by calling DataValueDescriptor.getNewNull() on the various Qualifier items in the cache? I'm just throwing out ideas here, don't even know if they make any sense. Alternatively, since this appears to have to do with invariant qualifiers, maybe the code can check the qualifiers in the cache to see if they are variant or invariant, and can make the getOrderable() call only for invariant qualifiers. (Is it safe to call Qualifier.getOrderable for an invariant qualifier even if the containing result set has not been opened yet?)
          Hide
          Dag H. Wanvik added a comment -

          Thanks for these great ideas, Bryan! The removed code does seem like a kludge, so it would be nice to solve this the right way. Accessing rows from an unopened rs seems weird in deed.

          Show
          Dag H. Wanvik added a comment - Thanks for these great ideas, Bryan! The removed code does seem like a kludge, so it would be nice to solve this the right way. Accessing rows from an unopened rs seems weird in deed.
          Hide
          Dag H. Wanvik added a comment -

          Linking also to DERBY-3033, wince there's where a lot of discussion on the removed code happened.

          Show
          Dag H. Wanvik added a comment - Linking also to DERBY-3033 , wince there's where a lot of discussion on the removed code happened.
          Hide
          Dag H. Wanvik added a comment -

          The call from NoPutResultSetImpl.clearOrderableCache() to Qualifier.getOrderable isn't the only call site that relies on the removed lines. Tracing shows that also the call to Qualifier.getOrderable from
          HashScanResultSet.getNextRowCore (in the example query) will provoke the NPE.
          Cf. Kathey's stack trace in DERBY-3033 for old bug #4736:

          at java.lang.Thread.dumpStack(Thread.java:1206)
          at org.apache.derby.impl.sql.execute.BaseActivation.getColumnFromRow(BaseActivation.java:1484)
          at org.apache.derby.exe.ac0b5b0099x012bx02f8x63e3x000003d2b1a047.e19(Unknown Source)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.derby.impl.services.reflect.ReflectMethod.invoke(ReflectMethod.java:46)
          at org.apache.derby.impl.sql.execute.GenericQualifier.getOrderable(GenericQualifier.java:96)
          at org.apache.derby.impl.sql.execute.HashScanResultSet.getNextRowCore(HashScanResultSet.java:395)
          at org.apache.derby.impl.sql.execute.NestedLoopLeftOuterJoinResultSet.getNextRowCore(NestedLoopLeftOuterJoinResultSet.java:104)
          at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java:261)
          at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(BasicNoPutResultSetImpl.java:477)
          at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(EmbedResultSet.java:429)
          at org.apache.derby.impl.jdbc.EmbedResultSet.next(EmbedResultSet.java:373)

          Show
          Dag H. Wanvik added a comment - The call from NoPutResultSetImpl.clearOrderableCache() to Qualifier.getOrderable isn't the only call site that relies on the removed lines. Tracing shows that also the call to Qualifier.getOrderable from HashScanResultSet.getNextRowCore (in the example query) will provoke the NPE. Cf. Kathey's stack trace in DERBY-3033 for old bug #4736: at java.lang.Thread.dumpStack(Thread.java:1206) at org.apache.derby.impl.sql.execute.BaseActivation.getColumnFromRow(BaseActivation.java:1484) at org.apache.derby.exe.ac0b5b0099x012bx02f8x63e3x000003d2b1a047.e19(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.derby.impl.services.reflect.ReflectMethod.invoke(ReflectMethod.java:46) at org.apache.derby.impl.sql.execute.GenericQualifier.getOrderable(GenericQualifier.java:96) at org.apache.derby.impl.sql.execute.HashScanResultSet.getNextRowCore(HashScanResultSet.java:395) at org.apache.derby.impl.sql.execute.NestedLoopLeftOuterJoinResultSet.getNextRowCore(NestedLoopLeftOuterJoinResultSet.java:104) at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java:261) at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(BasicNoPutResultSetImpl.java:477) at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(EmbedResultSet.java:429) at org.apache.derby.impl.jdbc.EmbedResultSet.next(EmbedResultSet.java:373)
          Hide
          Bryan Pendleton added a comment -

          Attaching an experimental diff. This patch avoids calling getOrderable
          in situations where we know there is no current row.

          I haven't done any testing of the patch beyond verifying that it
          avoids the NPE with the repro query.

          I'm attaching it hoping it provides some ideas for further exploration.

          Show
          Bryan Pendleton added a comment - Attaching an experimental diff. This patch avoids calling getOrderable in situations where we know there is no current row. I haven't done any testing of the patch beyond verifying that it avoids the NPE with the repro query. I'm attaching it hoping it provides some ideas for further exploration.
          Hide
          Dag H. Wanvik added a comment - - edited

          Thanks for the idea, Bryan! Unfortunately, the patch breaks e.g. OuterJoinTest. By tracing, it looks
          like to me to be hitting a lot more cases, than are eventually bailed out by the removed code in BaseActivation. In the example, there are 16 cases, all for result set 5, that require the bail out.
          I presume the cut-off in your experiment hits cases that would eventually fill the current row, as so misses some code that needs to to be run, cf the failures in OuterJoinTest I see, but I am barely starting to understand what's going on here.. I would like to understand why this "cut-off" (as in your experiment) is not normally needed (or is harmful), but is for this particular RS ("deeply nested join rs", if comments are to be believed).

          Show
          Dag H. Wanvik added a comment - - edited Thanks for the idea, Bryan! Unfortunately, the patch breaks e.g. OuterJoinTest. By tracing, it looks like to me to be hitting a lot more cases, than are eventually bailed out by the removed code in BaseActivation. In the example, there are 16 cases, all for result set 5, that require the bail out. I presume the cut-off in your experiment hits cases that would eventually fill the current row, as so misses some code that needs to to be run, cf the failures in OuterJoinTest I see, but I am barely starting to understand what's going on here.. I would like to understand why this "cut-off" (as in your experiment) is not normally needed (or is harmful), but is for this particular RS ("deeply nested join rs", if comments are to be believed).
          Hide
          Dag H. Wanvik added a comment -

          Pploading a diff to OuterJoinTest that includes a fixture with the problematic query in this issue. If applied to trunk, the test should provoke the NPE.

          Show
          Dag H. Wanvik added a comment - Pploading a diff to OuterJoinTest that includes a fixture with the problematic query in this issue. If applied to trunk, the test should provoke the NPE.
          Hide
          Bryan Pendleton added a comment -

          I think the extra tests will be quite helpful.

          I don't really understand the concept of the "orderable cache",
          nor the concept of "prefetching invariant values", so I'm not
          really sure what this code (in HashScanResultSet and
          NoPutResultSetImpl) is attempting to achieve. That's probably
          crucial to understand in order to change these behaviors.

          For now, the best course is probably to back out the
          DERBY-3097 change, since it demonstrably breaks for
          these query types, until we have the time to figure out what
          this part of the system is doing.

          Show
          Bryan Pendleton added a comment - I think the extra tests will be quite helpful. I don't really understand the concept of the "orderable cache", nor the concept of "prefetching invariant values", so I'm not really sure what this code (in HashScanResultSet and NoPutResultSetImpl) is attempting to achieve. That's probably crucial to understand in order to change these behaviors. For now, the best course is probably to back out the DERBY-3097 change, since it demonstrably breaks for these query types, until we have the time to figure out what this part of the system is doing.
          Hide
          Dag H. Wanvik added a comment - - edited

          Nor do I understand those concepts yet, so I agree it's probably best to back out the removal from DERBY-3097. I will add a reference in the "reverted" code to the new test which would break if the code is not there as help of future explorers.

          Meanwhile, I upload some files I produced so far:

          a) the optimized tree pinpointing (see arrow <-----) the result set (5) and column x3 (3)
          whose retrieval is the the problem in this query. Rs 5, which contains 3
          columns; x1, x2 and x3, represents this tree:

          t1 LEFT OUTER JOIN (t2 LEFT OUTER JOIN t3 ON t2.x2 = t3.x3 ) ON t1.x1 = t2.x2

          "derby-4798-optimized-tree.txt"

          b) the query plan with the bailout code in place in BaseActivation:
          "derby4798-whence-rs-5-and-execution-plan.txt"

          c) the disassembled generated code for this query:
          "derby-4798-generated-code.txt"

          Show
          Dag H. Wanvik added a comment - - edited Nor do I understand those concepts yet, so I agree it's probably best to back out the removal from DERBY-3097 . I will add a reference in the "reverted" code to the new test which would break if the code is not there as help of future explorers. Meanwhile, I upload some files I produced so far: a) the optimized tree pinpointing (see arrow <-----) the result set (5) and column x3 (3) whose retrieval is the the problem in this query. Rs 5, which contains 3 columns; x1, x2 and x3, represents this tree: t1 LEFT OUTER JOIN (t2 LEFT OUTER JOIN t3 ON t2.x2 = t3.x3 ) ON t1.x1 = t2.x2 "derby-4798-optimized-tree.txt" b) the query plan with the bailout code in place in BaseActivation: "derby4798-whence-rs-5-and-execution-plan.txt" c) the disassembled generated code for this query: "derby-4798-generated-code.txt"
          Hide
          Dag H. Wanvik added a comment -

          Uploading a patch, derby-4798a, which reintroduces the bailout code in BaseActivation, plus adds the repro from this issue to OuterJoinTest. Running regressions.

          Show
          Dag H. Wanvik added a comment - Uploading a patch, derby-4798a, which reintroduces the bailout code in BaseActivation, plus adds the repro from this issue to OuterJoinTest. Running regressions.
          Hide
          Dag H. Wanvik added a comment -

          Tests passed modulo Heisenbug DERBY-2949.

          Show
          Dag H. Wanvik added a comment - Tests passed modulo Heisenbug DERBY-2949 .
          Hide
          Dag H. Wanvik added a comment -

          Committed as svn 998170.

          Show
          Dag H. Wanvik added a comment - Committed as svn 998170.
          Hide
          Dag H. Wanvik added a comment -

          Backported to 10.6 as svn 998475, closing.

          Show
          Dag H. Wanvik added a comment - Backported to 10.6 as svn 998475, closing.
          Hide
          Kathey Marsden added a comment -

          Reopen for backport.

          Show
          Kathey Marsden added a comment - Reopen for backport.
          Hide
          Kathey Marsden added a comment -

          Here is a patch for this issue for 10.5. The code merge for revision 998170 was clean but the test needed to be moved from OuterJoinTest to JoinTest in 10.5.

          Show
          Kathey Marsden added a comment - Here is a patch for this issue for 10.5. The code merge for revision 998170 was clean but the test needed to be moved from OuterJoinTest to JoinTest in 10.5.
          Hide
          Kathey Marsden added a comment -

          Resolving after merge to 10.5. Thank you Dag for the fix!

          Show
          Kathey Marsden added a comment - Resolving after merge to 10.5. Thank you Dag for the fix!
          Hide
          Knut Anders Hatlen added a comment -

          Assigning back to Dag.

          Show
          Knut Anders Hatlen added a comment - Assigning back to Dag.
          Hide
          Knut Anders Hatlen added a comment -

          [bulk update] Close all resolved issues that haven't been updated for more than one year.

          Show
          Knut Anders Hatlen added a comment - [bulk update] Close all resolved issues that haven't been updated for more than one year.

            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Dag H. Wanvik
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development