A Derby crash at exactly the wrong time during a btree split can cause a corrupt db which cannot be booted.

    Details

    • Issue & fix info:
      High Value Fix, Patch Available
    • Bug behavior facts:
      Crash, Data corruption

      Description

      A Derby crash at exactly the wrong time during a btree split can cause a corrupt db which cannot be booted.

      A problem in the split code, combined with a crash at exactly the wrong time, can leave the database in a state
      where undo of purge operations corrupts index pages during redo. Recovery at boot then
      never succeeds, so the database can never be booted. At a high level, what happens is that
      a purge happens on a page and, before it commits, another transaction uses the space freed by the
      purge to do an insert and then commits. The system then crashes before the purging transaction
      gets a chance to commit. During undo, the purge expects there to be space to undo the purge,
      but there is not, and the undo corrupts the page in various ways depending on the size and placement
      of the inserts. The error actually returned to the user varies from sane to insane because the problem
      is noticed only after the corruption occurs rather than during the undo.

          Activity

          Mike Matrigali added a comment -

          I found this problem by code inspection and have not reproduced it myself. This problem is another
          code path of the same problem fixed by DERBY-5258. I believe that either DERBY-5258 or this
          issue is causing the problems that have been reported as DERBY-5281 and DERBY-5248. See
          those 2 issues for a detailed description of the order of log records and the crash timing necessary to
          reproduce this problem.

          At a high, abstract level the current code does:

          get latch
          purge row
          release latch
          close table
          commit

          If another transaction gets the latch and inserts rows between the release latch and the commit, and the system crashes before
          the commit, then this problem can happen. The fix is to not release the latch, and let commit release it.
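
The interleaving described in the comment above can be sketched with a toy model. This is not Derby code: the `Page` class, the string transaction names, and the method `undoOfPurgeFindsSpace` are all hypothetical, and the "latch" is just an owner field on a fixed-capacity list. The sketch only illustrates why holding the latch until commit keeps the purged space available for undo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a fixed-size page guarded by a single latch. Not Derby code. */
class PurgeLatchSketch {

    static final class Page {
        private final int capacity;
        private final List<String> rows = new ArrayList<>();
        private String latchOwner; // null means the latch is free

        Page(int capacity, List<String> initialRows) {
            this.capacity = capacity;
            this.rows.addAll(initialRows);
        }

        /** Returns true if the given transaction now holds the latch. */
        boolean tryLatch(String txn) {
            if (latchOwner == null || latchOwner.equals(txn)) {
                latchOwner = txn;
                return true;
            }
            return false;
        }

        void unlatch(String txn) {
            if (txn.equals(latchOwner)) {
                latchOwner = null;
            }
        }

        boolean hasFreeSpace() {
            return rows.size() < capacity;
        }
    }

    /**
     * Replays: T1 purges, T2 tries to insert and commit, crash before T1 commits.
     * Returns true if undo of the purge still finds room on the page.
     */
    static boolean undoOfPurgeFindsSpace(boolean holdLatchUntilCommit) {
        Page page = new Page(2, Arrays.asList("a", "b")); // page starts full

        // T1: get latch, purge row.
        page.tryLatch("T1");
        String purged = page.rows.remove(0);
        if (!holdLatchUntilCommit) {
            page.unlatch("T1"); // old ordering: release latch before commit
        }

        // T2: insert into the reclaimed space and commit. A real system would
        // make T2 wait for the latch; this toy model simply skips the insert.
        if (page.tryLatch("T2")) {
            page.rows.add("c");
            page.unlatch("T2");
        }

        // Crash before T1 commits: recovery must undo the purge.
        boolean roomForUndo = page.hasFreeSpace();
        if (roomForUndo) {
            page.rows.add(0, purged);
        }
        return roomForUndo;
    }

    public static void main(String[] args) {
        System.out.println("release latch before commit: undo finds space = "
                + undoOfPurgeFindsSpace(false));
        System.out.println("hold latch until commit:     undo finds space = "
                + undoOfPurgeFindsSpace(true));
    }
}
```

With the early release, the second transaction fills the slot the purge freed, so the undo has nowhere to put the row back. With the latch held until commit, the second transaction never touches the page before the crash and the undo has room.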

          Mike Matrigali added a comment - edited

          Patch for this issue. The full set of tests passed with this patch applied to trunk.

          Mike Matrigali added a comment -

          DERBY-5258 fixes the same kind of problem in the background cleaner code path.

          Mike Matrigali added a comment -

          This issue could cause the problem reported in either DERBY-5281 or DERBY-5248. I can't tell for sure from the information provided, and without a repro that I can run I can't be absolutely sure this is their problem.

          Mike Matrigali added a comment -

          fixed in trunk.

          Sending java\engine\org\apache\derby\impl\store\access\btree\BTreeController.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapCompressScan.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java
          Transmitting file data ...
          Committed revision 1138275.

          Mike Matrigali added a comment -

          backported fix from trunk to 10.8 branch, clean merge.

          Sending java\engine\org\apache\derby\impl\store\access\btree\BTreeController.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapCompressScan.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java
          Transmitting file data ...
          Committed revision 1138570.

          Mike Matrigali added a comment -

          back ported to 10.7 branch. clean merge.

          Sending java\engine\org\apache\derby\impl\store\access\btree\BTreeController.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapCompressScan.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java
          Transmitting file data ...
          Committed revision 1138701.

          Mike Matrigali added a comment -

          backported fix to the 10.6 and 10.5 branches. Both were clean merges.

          Mike Matrigali added a comment -

          backported fix from trunk to 10.4, minor conflict resolution required in merge.

          s104_jdk16:22>svn commit

          Sending java\engine\org\apache\derby\impl\store\access\btree\BTreeController.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapCompressScan.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java
          Transmitting file data ...
          Committed revision 1139085.

          Mike Matrigali added a comment -

          backported fix from 10.4 branch to the 10.3 and 10.2 branches. By doing the merge from 10.4 where I had done a conflict resolution I got clean merges into both 10.3 and 10.2.

          Mike Matrigali added a comment -

          merged change from 10.4 branch to 10.1 branch.

          s101_ibm16:9>svn commit

          Sending java\engine\org\apache\derby\impl\store\access\btree\BTreeController.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java
          Transmitting file data ..
          Committed revision 1139109.

          Mike Matrigali added a comment -

          Fixed and backported to all branches from 10.1 to 10.8.

          Knut Anders Hatlen added a comment -

          [bulk update] Close all resolved issues that haven't been updated for more than one year.


            People

            • Assignee:
              Mike Matrigali
              Reporter:
              Mike Matrigali
            • Votes:
              0
              Watchers:
              2