DERBY-4960

Race condition in FileContainer#allocCache when reopening RAFContainer after interrupt

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.8.1.2
    • Fix Version/s: 10.8.1.2
    • Component/s: Store
    • Labels:
      None

      Description

      The symptom is an ArrayIndexOutOfBoundsException:

      java.lang.ArrayIndexOutOfBoundsException: -1
      at org.apache.derby.impl.store.raw.data.AllocationCache.validate(AllocationCache.java:581)
      at org.apache.derby.impl.store.raw.data.AllocationCache.getLastPageNumber(AllocationCache.java:122)
      at org.apache.derby.impl.store.raw.data.FileContainer.pageValid(FileContainer.java:2067)
      at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2522)
      at org.apache.derby.impl.store.raw.data.FileContainer.getInsertablePage(FileContainer.java:2867)
      at org.apache.derby.impl.store.raw.data.FileContainer.getPageForInsert(FileContainer.java:3017)
      at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPageForInsert(BaseContainerHandle.java:372)
      at org.apache.derby.impl.store.access.heap.HeapController.doInsert(HeapController.java:244)
      at org.apache.derby.impl.store.access.heap.HeapController.insertAndFetchLocation(HeapController.java:599)
      at org.apache.derby.impl.sql.execute.RowChangerImpl.insertRow(RowChangerImpl.java:452)
      at org.apache.derby.impl.sql.execute.InsertResultSet.normalInsertCore(InsertResultSet.java:1028)
      at org.apache.derby.impl.sql.execute.InsertResultSet.open(InsertResultSet.java:505)
      at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:436)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:317)
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1241)
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1686)
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java:308)
      at InterruptTest$WorkerThread.run(InterruptTest.java:261)

      This can only happen if another thread calls allocCache.reset, setting numExtents to 0, while the thread above is inside the loop in validate.
      The synchronization of allocCache is documented in the Javadoc of the FileContainer class: all accesses to allocCache should synchronize.
      The reopen path omits this synchronization: FileContainer#openContainer calls readHeader -> readHeaderFromArray -> allocCache.reset without synchronizing.
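The interleaving described above can be reproduced deterministically in a toy model. The sketch below is not Derby code: ToyAllocCache, its fields, and the hook parameter are hypothetical names invented for illustration. The hook runs between the emptiness check and the array access, standing in for a second thread that calls reset at the worst moment.

```java
/** Toy stand-in for an allocation cache (hypothetical; not Derby's AllocationCache). */
final class ToyAllocCache {
    private final long[] extentLastPage = {9, 19, 29};
    private int numExtents = extentLastPage.length;

    /** Mimics a reset issued by another thread: marks the cache as empty. */
    void reset() {
        numExtents = 0;
    }

    /**
     * Reads the last page number in two steps. The hook runs between the
     * check and the array access, simulating a concurrent thread.
     */
    long lastPage(Runnable betweenCheckAndUse) {
        if (numExtents == 0) {
            return -1;
        }
        betweenCheckAndUse.run();
        // If reset() ran in the hook, numExtents is 0 and the index is -1.
        return extentLastPage[numExtents - 1];
    }
}

class RaceSketch {
    /** Returns true if a reset between check and use causes an out-of-bounds access. */
    static boolean resetDuringReadFails() {
        ToyAllocCache cache = new ToyAllocCache();
        try {
            cache.lastPage(cache::reset);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new ToyAllocCache().lastPage(() -> { }));
        System.out.println(resetDuringReadFails());
    }
}
```

In real code the reset would come from another thread rather than a hook, which is why every access is expected to hold the same monitor.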

      1. InterruptTest.java
        17 kB
        Dag H. Wanvik
      2. derby-4960-1.diff
        6 kB
        Dag H. Wanvik
      3. derby-4960-1.stat
        0.4 kB
        Dag H. Wanvik
      4. derby-4960-2.diff
        4 kB
        Dag H. Wanvik
      5. derby-4960-2.stat
        0.2 kB
        Dag H. Wanvik

          Activity

          Gavin made changes -
          Workflow jira [ 12541798 ] Default workflow, editable Closed status [ 12801018 ]
          Rick Hillegas made changes -
          Affects Version/s 10.8.1.2 [ 12316362 ]
          Affects Version/s 10.8.1.1 [ 12316356 ]
          Fix Version/s 10.8.1.2 [ 12316362 ]
          Fix Version/s 10.8.1.1 [ 12316356 ]
          Rick Hillegas made changes -
          Affects Version/s 10.8.1.1 [ 12316356 ]
          Affects Version/s 10.8.1.0 [ 12315561 ]
          Fix Version/s 10.8.1.1 [ 12316356 ]
          Fix Version/s 10.8.1.0 [ 12315561 ]
          Kathey Marsden made changes -
          Fix Version/s 10.8.0.0 [ 12315561 ]
          Dag H. Wanvik made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Resolution Fixed [ 1 ]
          Dag H. Wanvik added a comment -

          Committed version 2 as svn 1056591, closing.

          Dag H. Wanvik made changes -
          Attachment derby-4960-2.diff [ 12467782 ]
          Attachment derby-4960-2.stat [ 12467783 ]
          Dag H. Wanvik added a comment -

          Uploading version 2, simplified.

          Dag H. Wanvik added a comment -

          Regressions passed. Note that the patch by itself does not make the repro work, since there are still outstanding parts of DERBY-4741 that need to be committed first (I saw the error while preparing a patch for those outstanding parts; sorry, I should have been clear about that). I'll still commit this fix, though, as an incremental improvement.

          Dag H. Wanvik made changes -
          Attachment derby-4960-1.diff [ 12467734 ]
          Attachment derby-4960-1.stat [ 12467735 ]
          Dag H. Wanvik added a comment -

          Uploading a patch, derby-4960-1, which omits reading the page header when reopening the container after an interrupt. This sidesteps the problem seen. A new "reopenContainer" method is used for this purpose. Additionally, the patch changes a couple of "private final" declarations to just "private", since "final" is redundant there.

          Running regressions.
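As a rough illustration of the idea in this comment (not the actual patch), the sketch below separates a first open, which reads the header and thereby resets the cached allocation state, from a reopen, which only reattaches the file. ToyFileContainer and all of its members are hypothetical names; only the method name reopenContainer comes from the comment.

```java
/** Toy model of a file container (hypothetical; not Derby's FileContainer). */
final class ToyFileContainer {
    private int cachedExtents;   // stands in for the allocation cache state
    private boolean channelOpen; // stands in for the underlying file channel
    private int headerReads;

    /** First open: attach the file and read the header, which resets the cache. */
    void openContainer() {
        channelOpen = true;
        readHeader();
    }

    /** Reopen after an interrupt closed the channel: reattach only, keep the cache. */
    void reopenContainer() {
        channelOpen = true;
    }

    /** Simulates an interrupt closing the underlying channel. */
    void interruptClosesChannel() {
        channelOpen = false;
    }

    /** Simulates normal work populating the cache. */
    void noteExtent() {
        cachedExtents++;
    }

    private void readHeader() {
        headerReads++;
        cachedExtents = 0; // the reset that raced with readers in the report
    }

    int cachedExtents() { return cachedExtents; }
    int headerReads() { return headerReads; }
    boolean isOpen() { return channelOpen; }
}
```

Because the reopen path never touches the cached state, it needs no synchronization on it, so readers that are mid-validation cannot observe a reset.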

          Dag H. Wanvik made changes -
          Assignee Dag H. Wanvik [ dagw ]
          Dag H. Wanvik made changes -
          Attachment InterruptTest.java [ 12467694 ]
          Dag H. Wanvik added a comment -

          For the record, to reproduce this I ran the attached InterruptTest thus:

          java -XX:-UseVMInterruptibleIO -cp $CLASSPATH:. InterruptTest lock

          Dag H. Wanvik added a comment -

          Trying to synchronize here gave deadlock instead:

          One thread owns allocCache and is waiting for the latch. Another thread owns the latch, sees the interrupt, and tries to reopen the container, and in that process needs to call allocCache.reset: hence deadlock.

          We can probably skip the call to allocCache.reset when we reopen: we are not really reusing the file container object for another container, we are just reopening it.
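The lock-ordering problem described in this comment can be sketched with two explicit locks. This is a toy model, not Derby code: the two ReentrantLock objects merely stand in for the allocCache monitor and the page latch, and all names are hypothetical. Timed tryLock calls are used so that the demo terminates instead of hanging the way a real deadlock would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {
    /** Runs two threads that take the two locks in opposite order and
     *  returns how many of them failed to obtain their second lock. */
    static int blockedThreads() {
        ReentrantLock allocCacheLock = new ReentrantLock(); // stands in for the allocCache monitor
        ReentrantLock pageLatch = new ReentrantLock();      // stands in for the page latch
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Inserter: owns allocCache, then wants the latch.
        Thread inserter = new Thread(() ->
            holdThenTry(allocCacheLock, pageLatch, bothHoldFirst, bothTried, blocked));
        // Reopener: owns the latch, then wants allocCache (to call reset).
        Thread reopener = new Thread(() ->
            holdThenTry(pageLatch, allocCacheLock, bothHoldFirst, bothTried, blocked));
        inserter.start();
        reopener.start();
        try {
            inserter.join();
            reopener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked.get();
    }

    private static void holdThenTry(ReentrantLock first, ReentrantLock second,
                                    CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                    AtomicInteger blocked) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await(); // make sure each thread already holds its first lock
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet(); // with plain lock() this thread would wait forever
            }
            bothTried.countDown();
            bothTried.await(); // keep holding the first lock until both attempts are over
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(blockedThreads());
    }
}
```

Both threads fail to get their second lock, which is the cycle described above. Avoiding the reset on reopen removes the second acquisition from the reopening thread, so no cycle can form.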

          Dag H. Wanvik made changes -
          Field Original Value New Value
          Link This issue is part of DERBY-4741 [ DERBY-4741 ]
          Dag H. Wanvik created issue -

            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Dag H. Wanvik
            • Votes:
              0
              Watchers:
              0
