Derby
  1. Derby
  2. DERBY-4577

An expanding update fails with an nospc.U error.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.6.1.0
    • Fix Version/s: 10.7.1.1
    • Component/s: Store
    • Labels:
      None
    • Issue & fix info:
      High Value Fix, Release Note Needed, Repro attached

      Description

      An update of a long row piece on an overflow page that has limited row used space, row reserved space, and page free space can throw the nospc.U error on an update. I will attach a patch with a simple junit test that reproduces this
      issue.

      This issue is probably the same as DERBY-2286. Logging a new issue so that it is clear the associated changes fix
      the attached repro. The repro for DERBY-2286 is random and I could only get it to fail on one machine. If after the fix
      for this goes in and no one can repro, then we can close DERBY-2286 as a repro.

      The nospc errror is never meant to get to the user. The code path that can raise it is shared by insert and update.
      Insert has code that internally catches the error and does the right thing. The error should never be raised for an update. What is meant to happen for updates is that on update on any page, including an overflow page should always
      in the worst case have enough room to transform the row piece from whatever it is to a just a row header with an
      overflow pointer to the next piece.

      For the attached test the row piece on the overflow page is 12 bytes reserved, and 0 bytes free on the page. So
      far in the code path the code has written a preliminary row header that is 4 bytes, and then has noticed that the
      remaining 8 bytes are not enough for an overflow pointer (worst case 8 bytes for page number + 4 bytes for record
      id).

      I think the real problem is that not enough minimum space is being reserved on the overflow page. There should be
      12 bytes reserved + the maximum size of a record header that does not include an overflow pointer. I think the
      code does the right thing in the case of head pages where the minimum reserved space is 12 for the "user" portion
      of the data, but it looks like we don't apply this reserved space to just user portion on overflow pages.

      I need to research more, but it looks like on overflow pages we apply the minimum reserved space to the entire row
      rather than just the user space.

      The stack trace being thrown in this case is:
      2010-03-06 00:18:40.750 GMT:
      Booting Derby version The Apache Software Foundation - Apache Derby - 10.6.0.0 alpha - (1): instance 3405c0cb-0127-30d6-6168-ffffe7
      008b2c
      on database directory C:\derby\s2\systest\out\junit\system\wombat
      ^M
      Database Class Loader started - derby.database.classpath=''^M
      2010-03-06 00:18:41.171 GMT Thread[main,5,main] (XID = 253), (SESSIONID = 1), (DATABASE = wombat), (DRDAID = null), Cleanup action s
      tarting^M
      2010-03-06 00:18:41.171 GMT Thread[main,5,main] (XID = 253), (SESSIONID = 1), (DATABASE = wombat), (DRDAID = null), Failed Statement
      is: UPDATE testBadUpdate set value = ? where id = ? with 2 parameters begin parameter #1: BLOB:Length=120000 :end parameter begin p
      arameter #2: 0 :end parameter ^M
      ERROR nospc: nospc.U^M
      at org.apache.derby.impl.store.raw.data.StoredPage.logRow(StoredPage.java:4106)^M
      at org.apache.derby.impl.store.raw.data.UpdateOperation.writeOptionalDataToBuffer(UpdateOperation.java:255)^M
      at org.apache.derby.impl.store.raw.data.UpdateOperation.<init>(UpdateOperation.java:106)^M
      at org.apache.derby.impl.store.raw.data.LoggableActions.actionUpdate(LoggableActions.java:80)^M
      at org.apache.derby.impl.store.raw.data.StoredPage.doUpdateAtSlot(StoredPage.java:8551)^M
      at org.apache.derby.impl.store.raw.data.BasePage.updateAtSlot(BasePage.java:1062)^M
      at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:472)
      ^M
      at org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:523)^M
      at org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:554)^M
      at org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:254)^M
      at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:436)^M
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:317)^M
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1232)^M
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1673)^M
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java:303)^M
      at org.apache.derbyTesting.functionTests.tests.store.DerbyBugTest2.run_one(DerbyBugTest2.java:224)^M
      at org.apache.derbyTesting.functionTests.tests.store.DerbyBugTest2.testOne(DerbyBugTest2.java:84)^M
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)^M
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:48)^M
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)^M
      at java.lang.reflect.Method.invoke(Method.java:600)^M
      at junit.framework.TestCase.runTest(TestCase.java:154)^M
      at junit.framework.TestCase.runBare(TestCase.java:127)^M
      at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:109)^M
      at junit.framework.TestResult$1.protect(TestResult.java:106)^M
      at junit.framework.TestResult.runProtected(TestResult.java:124)^M
      at junit.framework.TestResult.run(TestResult.java:109)^M
      at junit.framework.TestCase.run(TestCase.java:118)^M
      at junit.framework.TestSuite.runTest(TestSuite.java:208)^M
      at junit.framework.TestSuite.run(TestSuite.java:203)^M
      at junit.framework.TestSuite.runTest(TestSuite.java:208)^M
      at junit.framework.TestSuite.run(TestSuite.java:203)^M
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)^M
      at junit.extensions.TestSetup$1.protect(TestSetup.java:19)^M
      at junit.framework.TestResult.runProtected(TestResult.java:124)^M
      at junit.extensions.TestSetup.run(TestSetup.java:23)^M
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)^M
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)^M
      at junit.extensions.TestSetup$1.protect(TestSetup.java:19)^M
      at junit.framework.TestResult.runProtected(TestResult.java:124)^M
      at junit.extensions.TestSetup.run(TestSetup.java:23)^M
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)^M
      at junit.framework.TestSuite.runTest(TestSuite.java:208)^M
      at junit.framework.TestSuite.run(TestSuite.java:203)^M
      at junit.textui.TestRunner.doRun(TestRunner.java:116)^M
      at junit.textui.TestRunner.start(TestRunner.java:172)^M
      at junit.textui.TestRunner.main(TestRunner.java:138)^M
      Cleanup action completed^M

      1. derby-4577.diff
        10 kB
        Mike Matrigali
      2. derby-4577_not_for_commit_fix.diff
        31 kB
        Mike Matrigali
      3. derby-4577_fix.diff
        37 kB
        Mike Matrigali
      4. derby-4577_fix2.diff
        36 kB
        Mike Matrigali
      5. releaseNote.html
        3 kB
        Mike Matrigali
      6. releaseNote.html
        3 kB
        Rick Hillegas

        Issue Links

          Activity

          Gavin made changes -
          Workflow jira [ 12501031 ] Default workflow, editable Closed status [ 12800438 ]
          Mike Matrigali made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Knut Anders Hatlen made changes -
          Link This issue is related to DERBY-6203 [ DERBY-6203 ]
          Knut Anders Hatlen made changes -
          Link This issue is related to DERBY-5858 [ DERBY-5858 ]
          Knut Anders Hatlen made changes -
          Link This issue is related to DERBY-5581 [ DERBY-5581 ]
          Rick Hillegas made changes -
          Fix Version/s 10.7.1.1 [ 12315564 ]
          Fix Version/s 10.7.1.0 [ 12314971 ]
          Rick Hillegas made changes -
          Attachment releaseNote.html [ 12459480 ]
          Hide
          Rick Hillegas added a comment -

          Updating the release note, incorporating Dag's comments on the 10.7.1 release notes.

          Show
          Rick Hillegas added a comment - Updating the release note, incorporating Dag's comments on the 10.7.1 release notes.
          Mike Matrigali made changes -
          Issue & fix info [High Value Fix] [High Value Fix, Release Note Needed, Repro attached]
          Mike Matrigali made changes -
          Attachment releaseNote.html [ 12448422 ]
          Hide
          Mike Matrigali added a comment -

          adding a release note.

          Show
          Mike Matrigali added a comment - adding a release note.
          Mike Matrigali made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Mike Matrigali made changes -
          Link This issue is duplicated by DERBY-2286 [ DERBY-2286 ]
          Mike Matrigali made changes -
          Fix Version/s 10.7.0.0 [ 12314971 ]
          Hide
          Mike Matrigali added a comment -

          change 959027 commits this fix to trunk.
          s1_ibm16:16>svn commit

          Sending java\engine\org\apache\derby\iapi\store\raw\RawStoreFactory.java
          Sending java\engine\org\apache\derby\impl\store\raw\data\StoredPage.java
          Sending java\engine\org\apache\derby\impl\store\raw\data\StoredRecordHead
          er.java
          Adding java\testing\org\apache\derbyTesting\functionTests\tests\store\Derby4577Test.java
          Sending java\testing\org\apache\derbyTesting\functionTests\tests\store_Suite.java
          Transmitting file data .....
          Committed revision 959027.

          Show
          Mike Matrigali added a comment - change 959027 commits this fix to trunk. s1_ibm16:16>svn commit Sending java\engine\org\apache\derby\iapi\store\raw\RawStoreFactory.java Sending java\engine\org\apache\derby\impl\store\raw\data\StoredPage.java Sending java\engine\org\apache\derby\impl\store\raw\data\StoredRecordHead er.java Adding java\testing\org\apache\derbyTesting\functionTests\tests\store\Derby4577Test.java Sending java\testing\org\apache\derbyTesting\functionTests\tests\store_Suite.java Transmitting file data ..... Committed revision 959027.
          Mike Matrigali made changes -
          Attachment derby-4577_fix2.diff [ 12448266 ]
          Hide
          Mike Matrigali added a comment - - edited

          Fixes problem in previous patch where main page rows were being affected during compact rows. Passes full set of tests on windows and linux. Also ran with no error against stress test submitted with DERBY-2286

          Show
          Mike Matrigali added a comment - - edited Fixes problem in previous patch where main page rows were being affected during compact rows. Passes full set of tests on windows and linux. Also ran with no error against stress test submitted with DERBY-2286
          Kristian Waagan made changes -
          Component/s Store [ 11412 ]
          Hide
          Kristian Waagan added a comment -

          For the record, my comment about Math.max was for improved readability only - it's less verbose and people know what the method does.

          Show
          Kristian Waagan added a comment - For the record, my comment about Math.max was for improved readability only - it's less verbose and people know what the method does.
          Hide
          Knut Anders Hatlen added a comment -

          I haven't tested this, but I think it is a fair assumption to expect that the code using the Math.max() method will end up getting optimized to the same native code as the hand-coded expression. Math.max() is a static method and should be easy to inline, according to this writeup [1] by the JVM experts. (It talks about static methods in general, though, not about Math.max() in particular. I read this under Methods: "Static, private, final, and/or "special" invocations are easy to inline.") And max(int,int) is implemented [2] just like the hand-crafted code in the patch, so inlining the call should be enough to get identical performance.

          [1] http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques
          [2] http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/tip/src/share/classes/java/lang/Math.java

          Show
          Knut Anders Hatlen added a comment - I haven't tested this, but I think it is a fair assumption to expect that the code using the Math.max() method will end up getting optimized to the same native code as the hand-coded expression. Math.max() is a static method and should be easy to inline, according to this writeup [1] by the JVM experts. (It talks about static methods in general, though, not about Math.max() in particular. I read this under Methods: "Static, private, final, and/or "special" invocations are easy to inline.") And max(int,int) is implemented [2] just like the hand-crafted code in the patch, so inlining the call should be enough to get identical performance. [1] http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques [2] http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/tip/src/share/classes/java/lang/Math.java
          Mike Matrigali made changes -
          Attachment derby-4577_fix.diff [ 12448127 ]
          Hide
          Mike Matrigali added a comment -

          The previous patch had one problem with btree root split, shown up when running
          the full suite in the databasemetadata tests, while running with SANE causing it to throw an assert. The root had gotten full and had some rows with less than 17 bytes and some more, and the copy to leaf routine was incorrectly applying the 17 byte minimum to all the copied rows and the copy did not work.

          The intent of the fix is to only apply the 17 byte minimum to overflow pages, so needed to change storeRecordForInsert to do so.

          This patch passed all tests run against a SANE against ibm16 on a windows machine.

          I ran the previous patch against the user provided repro in DERBY-2286 and it not fail in a couple of hours against a normal server on a 2 processor, fast linux box. I got a wierd logging error when I tried to run it against a durability=test build - I believe that is
          more about timing in the logging system than anything I am changing. I'll file something separate on that if I can reproduce.

          I will run some linux insane tests and look to commit to trunk sometime next week.

          I addressed the test issue brought up. Will look at doing something about mixed space and leading tabs after the fix. I just didn't think of the max routine, all this code is pretty low level critical - any opinions on if Math.max is as fast as hand coding using ? operator.

          Show
          Mike Matrigali added a comment - The previous patch had one problem with btree root split, shown up when running the full suite in the databasemetadata tests, while running with SANE causing it to throw an assert. The root had gotten full and had some rows with less than 17 bytes and some more, and the copy to leaf routine was incorrectly applying the 17 byte minimum to all the copied rows and the copy did not work. The intent of the fix is to only apply the 17 byte minimum to overflow pages, so needed to change storeRecordForInsert to do so. This patch passed all tests run against a SANE against ibm16 on a windows machine. I ran the previous patch against the user provided repro in DERBY-2286 and it not fail in a couple of hours against a normal server on a 2 processor, fast linux box. I got a wierd logging error when I tried to run it against a durability=test build - I believe that is more about timing in the logging system than anything I am changing. I'll file something separate on that if I can reproduce. I will run some linux insane tests and look to commit to trunk sometime next week. I addressed the test issue brought up. Will look at doing something about mixed space and leading tabs after the fix. I just didn't think of the max routine, all this code is pretty low level critical - any opinions on if Math.max is as fast as hand coding using ? operator.
          Hide
          Kristian Waagan added a comment -

          Hi Mike,

          Just had a quick look at the patch out of curiosity, I did not review the logic properly.
          I verified that the test succeeds with the patch, and fails without it.

          Some nits to consider for a next revision (if any):

          • any reason why you're not using Math.max() when calculating size_to_reserve?
          • mixing tabs and spaces for indentation on changed/new lines
          • wrong class name in license header for the test
          Show
          Kristian Waagan added a comment - Hi Mike, Just had a quick look at the patch out of curiosity, I did not review the logic properly. I verified that the test succeeds with the patch, and fails without it. Some nits to consider for a next revision (if any): any reason why you're not using Math.max() when calculating size_to_reserve? mixing tabs and spaces for indentation on changed/new lines wrong class name in license header for the test
          Mike Matrigali made changes -
          Attachment derby-4577_not_for_commit_fix.diff [ 12447988 ]
          Hide
          Mike Matrigali added a comment - - edited

          updated note - this fix did not pass tests. See subsequent patch.

          working copy of fix. preliminary, not ready for commit yet. still running tests. After
          further debugging the problem does seem to be isolated to overflow pages. This
          patch fixes the test case which is also in the patch. That test case was 100%
          reproducible on my machine.

          The overall issue is that the current design to handle expanding updates of rows on both main pages and overflow pages is that it is assumed that each piece can always in the worst case of an expanding update be updated to have a piece that is just an overflow row pointer with all the data moved to another overflow page. The worst case size of this is 17 bytes which is 1 byte status, 4 bytes record id, 8 bytes overflow page, 4 bytes oveflow record id. In some cases the space to do this is actually longer than the actual space of an existing row (ie. something like a 1 column blob that has no data in it).

          On main pages this is taken care of by the minimum reserve space. The main intent of minimum reserve space is to allow user to specify the amount of space to save for the "user" portion of the row. The system sets a minimum user reserve space of 12 for main page rows. This works around "reserve for overflow" problem as it means there is always space for just the overflow pointer part to come out of the user portion does not include the record header part.

          On overflow pages the code does not enforce minimum reserve space, because for user rows it does not make sense. The point of reserve space is to avoid oveflowing the row on expanding updates to another page. The system only maintains this reserve space on main pages.

          Thus a different mechanism is needed to also insure on overflow pages that there is enough space reserved for the "reserve for overflow" problem. This current patch does this by just making sure overflow rows on insert are always >= 17 bytes including reserve space for the row. It also changes the code that deals with reclaiming space from shrinking updates to also maintain this 17 byte minimum.

          Going through the code the use of minimum reserve space was confusing and I believe inconsistent. Sometimes it was applied to user space, sometimes to the whole row. The patch changes the use of this in a few places to make it more consistent.

          For anyone reviewing the patch, it is probably easier to look side by side at
          compactRecord() and checkRowReservedSpace() than at the diffs. Some reformats along with "real" code changes went into this area as I was having a
          hard time understanding the original logic.

          The goal of the change is to only affect the minimum length of rows on overflow pages, so that short rows on main pages should remain the same length. Also
          this fix only stops the bug from happening to future inserted rows. Existing
          overflow rows may have the problem. Running offline compress should fix an
          existing table that runs into the problem.

          Show
          Mike Matrigali added a comment - - edited updated note - this fix did not pass tests. See subsequent patch. working copy of fix. preliminary, not ready for commit yet. still running tests. After further debugging the problem does seem to be isolated to overflow pages. This patch fixes the test case which is also in the patch. That test case was 100% reproducible on my machine. The overall issue is that the current design to handle expanding updates of rows on both main pages and overflow pages is that it is assumed that each piece can always in the worst case of an expanding update be updated to have a piece that is just an overflow row pointer with all the data moved to another overflow page. The worst case size of this is 17 bytes which is 1 byte status, 4 bytes record id, 8 bytes overflow page, 4 bytes oveflow record id. In some cases the space to do this is actually longer than the actual space of an existing row (ie. something like a 1 column blob that has no data in it). On main pages this is taken care of by the minimum reserve space. The main intent of minimum reserve space is to allow user to specify the amount of space to save for the "user" portion of the row. The system sets a minimum user reserve space of 12 for main page rows. This works around "reserve for overflow" problem as it means there is always space for just the overflow pointer part to come out of the user portion does not include the record header part. On overflow pages the code does not enforce minimum reserve space, because for user rows it does not make sense. The point of reserve space is to avoid oveflowing the row on expanding updates to another page. The system only maintains this reserve space on main pages. Thus a different mechanism is needed to also insure on overflow pages that there is enough space reserved for the "reserve for overflow" problem. This current patch does this by just making sure overflow rows on insert are always >= 17 bytes including reserve space for the row. It also changes the code that deals with reclaiming space from shrinking updates to also maintain this 17 byte minimum. Going through the code the use of minimum reserve space was confusing and I believe inconsistent. Sometimes it was applied to user space, sometimes to the whole row. The patch changes the use of this in a few places to make it more consistent. For anyone reviewing the patch, it is probably easier to look side by side at compactRecord() and checkRowReservedSpace() than at the diffs. Some reformats along with "real" code changes went into this area as I was having a hard time understanding the original logic. The goal of the change is to only affect the minimum length of rows on overflow pages, so that short rows on main pages should remain the same length. Also this fix only stops the bug from happening to future inserted rows. Existing overflow rows may have the problem. Running offline compress should fix an existing table that runs into the problem.
          Mike Matrigali made changes -
          Attachment derby-4577.diff [ 12438066 ]
          Hide
          Mike Matrigali added a comment -

          patch with junit test that repro's the bug. not ready for commit.

          Show
          Mike Matrigali added a comment - patch with junit test that repro's the bug. not ready for commit.
          Hide
          Mike Matrigali added a comment -

          it is likely that this is same bug as is demonstrated on some machines by the test associated with DERBY-2286. Defintely has same stack.

          Show
          Mike Matrigali added a comment - it is likely that this is same bug as is demonstrated on some machines by the test associated with DERBY-2286 . Defintely has same stack.
          Mike Matrigali made changes -
          Field Original Value New Value
          Link This issue is related to DERBY-2286 [ DERBY-2286 ]
          Mike Matrigali created issue -

            People

            • Assignee:
              Mike Matrigali
              Reporter:
              Mike Matrigali
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development