Derby
  1. Derby
  2. DERBY-151

Thread termination -> XSDG after operation is 'complete'

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.2.1
    • Fix Version/s: 10.6.1.0
    • Component/s: Store
    • Labels:
      None
    • Environment:
      Linux kernel 2.4.21-243-athlon (SuSE 9.0)
    • Issue & fix info:
      High Value Fix, Repro attached
    • Bug behavior facts:
      Seen in production

      Description

      I've encountered what appears to be a bug related to threading. After an INSERT operation, if the invoking thread terminates too quickly, Derby throws an XSDG.

      The bug is a bit difficult to isolate but it occurs consistently in the following situation (with a particular database and an operation of a particular size):

      Derby is running in embedded mode with autocommit on.
      The application performs an INPUT operation from a thread that is not the main thread. The INPUT is issued using a PreparedStatement. The INPUT adds ~ 256 records of six fields each. (Note that INSERTs of this size seem to work fine in other contexts.)

      The preparedStatement.executeUpdate() seems to excute successfully; at least it returns without throwing an exception.

      The thread that invoked the INPUT operation then terminates (but NOT the application). The next INPUT operation then results in an

      "ERROR XSDG1: Page Page(7,Container(0, 1344)) could not be written to disk, please check if disk is full."

      The disk is definitely not full.

      HOWEVER, if I put the calling thread to sleep for a second before it exits, the problem does not occur.

      I'm not quite sure what to make of this. I was under the impression that most of Derby's activity occurs in the application's threads. Could Derby be creating a child thread from in the application thread, which dies when the parent thread terminates?

      Thanks

      1. derby-151-followup.diff
        5 kB
        Dag H. Wanvik
      2. derby-151-c.stat
        0.5 kB
        Dag H. Wanvik
      3. derby-151-c.diff
        14 kB
        Dag H. Wanvik
      4. derby-151-b.diff
        14 kB
        Dag H. Wanvik
      5. derby-151-b.stat
        0.5 kB
        Dag H. Wanvik
      6. derby-151-a.stat
        0.4 kB
        Dag H. Wanvik
      7. derby-151-a.diff
        9 kB
        Dag H. Wanvik
      8. Derby151Test.java
        2 kB
        Paul Sabatino
      9. d151.java
        0.6 kB
        Knut Anders Hatlen
      10. derby.log
        67 kB
        Jörg von Frantzius

        Issue Links

          Activity

          Hide
          Mike Matrigali added a comment -

          Without some sort of reproducible case, don't think this issue will be addressed. Also always include all the
          stack trace information from derby.log whenever possible.

          My best guess is that somehow a thread interrupt is being sent to the thread issuing the I/O, possibly the interrupt
          is even being posted to the thread before the execute statement is being called?

          To answer the question, the I/O described above could be issued either by the thread doing the insert , or it could
          be issued by a background thread executing a checkpoint. The stack track would tell which.

          Show
          Mike Matrigali added a comment - Without some sort of reproducible case, don't think this issue will be addressed. Also always include all the stack trace information from derby.log whenever possible. My best guess is that somehow a thread interrupt is being sent to the thread issuing the I/O, possibly the interrupt is even being posted to the thread before the execute statement is being called? To answer the question, the I/O described above could be issued either by the thread doing the insert , or it could be issued by a background thread executing a checkpoint. The stack track would tell which.
          Hide
          Jörg von Frantzius added a comment -

          I'm seeing the same, on Windows XP (Derby 10.1.1.0). Here's my stacktrace:

          [main][14.09.2005 16:37:36,115] DEBUG Request.java:66 - INSERT INTO KUENSTLER (LEBENSDATEN,LITERATUR,GEBURTS_ORT,KUENSTLER_NAME,LEGACY_ID,"NAME",GEBURTSDATUM,LAST_MODIFICATION,ERGAENZUNGEN,ADRESSE,GESCHLECHT,STERBEDATUM,VORNAME,BIOGRAPHIE,ID) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
          ERROR XSDG1: Page Page(7,Container(0, 1088)) could not be written to disk, please check if disk is full.
          at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
          at org.apache.derby.impl.store.raw.data.CachedPage.writePage(Unknown Source)
          at org.apache.derby.impl.store.raw.data.CachedPage.createIdentity(Unknown Source)
          at org.apache.derby.impl.services.cache.CachedItem.takeOnIdentity(Unknown Source)
          at org.apache.derby.impl.services.cache.Clock.addEntry(Unknown Source)
          at org.apache.derby.impl.services.cache.Clock.create(Unknown Source)
          at org.apache.derby.impl.store.raw.data.FileContainer.initPage(Unknown Source)
          at org.apache.derby.impl.store.raw.data.FileContainer.newPage(Unknown Source)
          at org.apache.derby.impl.store.raw.data.BaseContainer.addPage(Unknown Source)
          at org.apache.derby.impl.store.raw.data.BaseContainerHandle.addPage(Unknown Source)
          at org.apache.derby.impl.store.access.heap.HeapController.doInsert(Unknown Source)
          at org.apache.derby.impl.store.access.heap.HeapController.insertAndFetchLocation(Unknown Source)
          at org.apache.derby.impl.sql.execute.RowChangerImpl.insertRow(Unknown Source)
          at org.apache.derby.impl.sql.execute.InsertResultSet.normalInsertCore(Unknown Source)
          at org.apache.derby.impl.sql.execute.InsertResultSet.open(Unknown Source)
          at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
          at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
          at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)

          Show
          Jörg von Frantzius added a comment - I'm seeing the same, on Windows XP (Derby 10.1.1.0). Here's my stacktrace: [main] [14.09.2005 16:37:36,115] DEBUG Request.java:66 - INSERT INTO KUENSTLER (LEBENSDATEN,LITERATUR,GEBURTS_ORT,KUENSTLER_NAME,LEGACY_ID,"NAME",GEBURTSDATUM,LAST_MODIFICATION,ERGAENZUNGEN,ADRESSE,GESCHLECHT,STERBEDATUM,VORNAME,BIOGRAPHIE,ID) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ERROR XSDG1: Page Page(7,Container(0, 1088)) could not be written to disk, please check if disk is full. at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.writePage(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.createIdentity(Unknown Source) at org.apache.derby.impl.services.cache.CachedItem.takeOnIdentity(Unknown Source) at org.apache.derby.impl.services.cache.Clock.addEntry(Unknown Source) at org.apache.derby.impl.services.cache.Clock.create(Unknown Source) at org.apache.derby.impl.store.raw.data.FileContainer.initPage(Unknown Source) at org.apache.derby.impl.store.raw.data.FileContainer.newPage(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseContainer.addPage(Unknown Source) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.addPage(Unknown Source) at org.apache.derby.impl.store.access.heap.HeapController.doInsert(Unknown Source) at org.apache.derby.impl.store.access.heap.HeapController.insertAndFetchLocation(Unknown Source) at org.apache.derby.impl.sql.execute.RowChangerImpl.insertRow(Unknown Source) at org.apache.derby.impl.sql.execute.InsertResultSet.normalInsertCore(Unknown Source) at org.apache.derby.impl.sql.execute.InsertResultSet.open(Unknown Source) at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
          Hide
          Jörg von Frantzius added a comment -

          I might add that the disk is not full, and in contrast to the original poster, it doesn't seem to be a threading issue. As the stacktrace shows, it is the thread that issued the UPDATE statement that gets the exception.

          I've got the following non-default page cache settings:

          derby.storage.pageCacheSize = 320
          derby.storage.pageSize = 16384

          Show
          Jörg von Frantzius added a comment - I might add that the disk is not full, and in contrast to the original poster, it doesn't seem to be a threading issue. As the stacktrace shows, it is the thread that issued the UPDATE statement that gets the exception. I've got the following non-default page cache settings: derby.storage.pageCacheSize = 320 derby.storage.pageSize = 16384
          Hide
          Suresh Thalamati added a comment -

          derby.log occasional has more information than the above stack mentioned. In most cases , some sort of IO exception has
          to happen for the above error to occur, all the exception are nested inside the SQL Exception. If the derby.log file is still around ,
          please attach it to the jira.

          Show
          Suresh Thalamati added a comment - derby.log occasional has more information than the above stack mentioned. In most cases , some sort of IO exception has to happen for the above error to occur, all the exception are nested inside the SQL Exception. If the derby.log file is still around , please attach it to the jira.
          Hide
          Jörg von Frantzius added a comment -

          Alright, I didn't think of that! Here's the log (I think the last shutdown can be ignored).

          Show
          Jörg von Frantzius added a comment - Alright, I didn't think of that! Here's the log (I think the last shutdown can be ignored).
          Hide
          Jörg von Frantzius added a comment -

          Sorry, that was the wrong log! Could someone please delete it? It seems I can't.
          Here's the correct one (derby.log)

          Show
          Jörg von Frantzius added a comment - Sorry, that was the wrong log! Could someone please delete it? It seems I can't. Here's the correct one (derby.log)
          Hide
          Suresh Thalamati added a comment -

          Real cause for the error(ERROR XSDG1: Page Page(7,Container(0, 1088)) could not be written to disk, please check if disk is full. ) is :
          java.nio.channels.ClosedByInterruptException
          at java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown Source)
          at sun.nio.ch.FileChannelImpl.force(Unknown Source)
          at org.apache.derby.impl.io.DirRandomAccessFile4.sync(Unknown Source)
          at org.apache.derby.impl.store.raw.data.RAFContainer.writePage(Unknown Source)
          at org.apache.derby.impl.store.raw.data.CachedPage.writePage(Unknown Source)
          ............

          file sync is being interrupted , because user is interrupting the thread that is doing the database activity.
          May the Interrupt exception should be caught .. and reissue the sync if possible and also generate better error
          if it can not be synced even on a second attempt . Discussion about transaction log on the user list:
          http://www.mail-archive.com/derby-user@db.apache.org/msg01577.html

          Show
          Suresh Thalamati added a comment - Real cause for the error(ERROR XSDG1: Page Page(7,Container(0, 1088)) could not be written to disk, please check if disk is full. ) is : java.nio.channels.ClosedByInterruptException at java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown Source) at sun.nio.ch.FileChannelImpl.force(Unknown Source) at org.apache.derby.impl.io.DirRandomAccessFile4.sync(Unknown Source) at org.apache.derby.impl.store.raw.data.RAFContainer.writePage(Unknown Source) at org.apache.derby.impl.store.raw.data.CachedPage.writePage(Unknown Source) ............ file sync is being interrupted , because user is interrupting the thread that is doing the database activity. May the Interrupt exception should be caught .. and reissue the sync if possible and also generate better error if it can not be synced even on a second attempt . Discussion about transaction log on the user list: http://www.mail-archive.com/derby-user@db.apache.org/msg01577.html
          Hide
          Jörg von Frantzius added a comment -

          Trying to catch ClosedByInterruptException in RAFContainer would give a dependency on Java 1.4. Not sure whether that is wanted, and I'm having a bit of a hard time to get build.xml compile it (there is 1.4 classes being used in impl.io without problems?!)

          Show
          Jörg von Frantzius added a comment - Trying to catch ClosedByInterruptException in RAFContainer would give a dependency on Java 1.4. Not sure whether that is wanted, and I'm having a bit of a hard time to get build.xml compile it (there is 1.4 classes being used in impl.io without problems?!)
          Hide
          Jörg von Frantzius added a comment -

          For any Eclipse-plugin, Swing or GUI programmers in general seeing this bug: writing to the embedded DB from the UI thread is a bad idea, as that thread gets interrupted very frequently. Moving write operations into their own thread solved the problem for stupid me.
          Cheers!

          Show
          Jörg von Frantzius added a comment - For any Eclipse-plugin, Swing or GUI programmers in general seeing this bug: writing to the embedded DB from the UI thread is a bad idea, as that thread gets interrupted very frequently. Moving write operations into their own thread solved the problem for stupid me. Cheers!
          Hide
          Kathey Marsden added a comment -

          I was wondering if anyone could provide some expanded information on this issue. I am really not all that familiar with running Derby in eclipse and have
          some probably very basic questions which are:

          1) I was wondering if anyone had some information on accessing Derby
          from within eclipse and avoiding ClosedByInterruptException as
          described in DERBY-151. Do you always need to launch a separate thread for any Derby access when running in eclipse?

          2) If the answer to #1 is "Yes", is there a Derby solution to this problem that is possible or is it strictly a user issue, for example, if the retry was implemented would that eliminate the requirement to access Derby in a separate thread?

          3) Are there any similar issues with interuption when accessing with the client driver?

          4) Is using embedded server ok with eclipse? The plugins doc seems to
          imply that you should always use network server and access with Derby
          client.
          http://db.apache.org/derby/integrate/plugin_help/start_toc.html

          Thanks!

          Show
          Kathey Marsden added a comment - I was wondering if anyone could provide some expanded information on this issue. I am really not all that familiar with running Derby in eclipse and have some probably very basic questions which are: 1) I was wondering if anyone had some information on accessing Derby from within eclipse and avoiding ClosedByInterruptException as described in DERBY-151 . Do you always need to launch a separate thread for any Derby access when running in eclipse? 2) If the answer to #1 is "Yes", is there a Derby solution to this problem that is possible or is it strictly a user issue, for example, if the retry was implemented would that eliminate the requirement to access Derby in a separate thread? 3) Are there any similar issues with interuption when accessing with the client driver? 4) Is using embedded server ok with eclipse? The plugins doc seems to imply that you should always use network server and access with Derby client. http://db.apache.org/derby/integrate/plugin_help/start_toc.html Thanks!
          Hide
          Kathey Marsden added a comment -

          I am thinking of closing this issue CannotReproduce. Please let me know if you have any objections.
          It can be reopened if we get a repro.

          Show
          Kathey Marsden added a comment - I am thinking of closing this issue CannotReproduce. Please let me know if you have any objections. It can be reopened if we get a repro.
          Hide
          Knut Anders Hatlen added a comment -

          Attached is a repro.

          The good news is that from 10.3 the java.nio.channels.ClosedByInterruptException is linked to the SQLException, so printStackTrace() will show that it's the underlying cause. But the SQLException still says "please check if disk is full".

          Another piece of good news: In an earlier comment, Suresh suggested that we could catch ClosedByInterruptException and use that to reissue the sync() and/or give a better error message. The suggestion was rejected because it would introduce a dependency on Java 1.4 in RAFContainer. From 10.3 we have RAFContainer4 which is depending on Java 1.4, so that shouldn't stop us anymore.

          Show
          Knut Anders Hatlen added a comment - Attached is a repro. The good news is that from 10.3 the java.nio.channels.ClosedByInterruptException is linked to the SQLException, so printStackTrace() will show that it's the underlying cause. But the SQLException still says "please check if disk is full". Another piece of good news: In an earlier comment, Suresh suggested that we could catch ClosedByInterruptException and use that to reissue the sync() and/or give a better error message. The suggestion was rejected because it would introduce a dependency on Java 1.4 in RAFContainer. From 10.3 we have RAFContainer4 which is depending on Java 1.4, so that shouldn't stop us anymore.
          Hide
          Paul Sabatino added a comment -

          We ran into this same error when running our application over slow network connections. At first we didn't see the connection between the network and the derby error until we started to investigate interrupts. We found that if a thread was ever interrupted and you attempt a commit on that same thread you'll see this error. It seems to be linked to having the threads interrupted flag set.

          Here's how to test it:
          1. add a Thread.interrupted(); right after the interrupt() in the attached test program d151.java. (this clears the interrupted flag)
          2. Make sure the DB directory is deleted and rerun the test. It will now complete.

          I managed to fix our application by adjusting our network layer to clear the interrupted flag when we timed out an IO operation. That was how our issue was related to a slow network and not a local disk problem.

          The error message is misleading since it seems to indicate that there's a problem writing to the file, when really it just misinterpreted the interrupted flag to mean the channel was closed. You might want to fix how the interrupts are checked.

          Show
          Paul Sabatino added a comment - We ran into this same error when running our application over slow network connections. At first we didn't see the connection between the network and the derby error until we started to investigate interrupts. We found that if a thread was ever interrupted and you attempt a commit on that same thread you'll see this error. It seems to be linked to having the threads interrupted flag set. Here's how to test it: 1. add a Thread.interrupted(); right after the interrupt() in the attached test program d151.java. (this clears the interrupted flag) 2. Make sure the DB directory is deleted and rerun the test. It will now complete. I managed to fix our application by adjusting our network layer to clear the interrupted flag when we timed out an IO operation. That was how our issue was related to a slow network and not a local disk problem. The error message is misleading since it seems to indicate that there's a problem writing to the file, when really it just misinterpreted the interrupted flag to mean the channel was closed. You might want to fix how the interrupts are checked.
          Hide
          Paul Sabatino added a comment -

          I thought I'd add a little more clarification to this. This error seems to only come up when the interrupt flag is set on a thread that is doing a commit that which causes a page to be written to disk. I figured this out by modifying the d151.java to take pageSize as an argument and to count how many iterations of insert and interrupt it took to fail. I then varied the pageSize and noted that the number of iterations before failure was proportional to the pageSize. So my guess was that the page was being written somewhere under the covers and causing this error.

          I've attached my test program. I hope this helps resolve this issue.

          Show
          Paul Sabatino added a comment - I thought I'd add a little more clarification to this. This error seems to only come up when the interrupt flag is set on a thread that is doing a commit that which causes a page to be written to disk. I figured this out by modifying the d151.java to take pageSize as an argument and to count how many iterations of insert and interrupt it took to fail. I then varied the pageSize and noted that the number of iterations before failure was proportional to the pageSize. So my guess was that the page was being written somewhere under the covers and causing this error. I've attached my test program. I hope this helps resolve this issue.
          Hide
          Trejkaz added a comment -

          I've been wrestling with this issue too. I can't reproduce it reliably in the live system but it happens often enough to be an annoyance, and our unit tests cause it to occur much more often.

          In our case it's occurring on one of our own threads so moving database work off the EDT can't help.

          Closing and reopening the connection doesn't make it work again, we have to do a shutdown and start it again, and then it appears to work (until the problem occurs again.) When running the real application when there is a connection pool in effect, this becomes very hard to do.

          Show
          Trejkaz added a comment - I've been wrestling with this issue too. I can't reproduce it reliably in the live system but it happens often enough to be an annoyance, and our unit tests cause it to occur much more often. In our case it's occurring on one of our own threads so moving database work off the EDT can't help. Closing and reopening the connection doesn't make it work again, we have to do a shutdown and start it again, and then it appears to work (until the problem occurs again.) When running the real application when there is a connection pool in effect, this becomes very hard to do.
          Hide
          Dag H. Wanvik added a comment -

          Running Knut's repro with Java 1.5 or 1.5, I see this error:

          Java exception: 'org/apache/derby/impl/sql/execute/NormalizeResultSet: java.lang.NoClassDefFoundError'.
          Exception in thread "main" java.sql.SQLException: No current connection.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:142)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:158)
          at org.apache.derby.impl.jdbc.Util.noCurrentConnection(Util.java:262)
          at org.apache.derby.impl.jdbc.EmbedConnection.checkIfClosed(EmbedConnection.java:2179)
          at org.apache.derby.impl.jdbc.EmbedConnection.createStatement(EmbedConnection.java:1334)
          at org.apache.derby.impl.jdbc.EmbedConnection.createStatement(EmbedConnection.java:1291)
          at d151.main(d151.java:18)

          which seems to be our old friend; the classloader being sensitive to the interrupted flag in the thread. With 1.6
          I see the disk is full message, though.

          Show
          Dag H. Wanvik added a comment - Running Knut's repro with Java 1.5 or 1.5, I see this error: Java exception: 'org/apache/derby/impl/sql/execute/NormalizeResultSet: java.lang.NoClassDefFoundError'. Exception in thread "main" java.sql.SQLException: No current connection. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:142) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:158) at org.apache.derby.impl.jdbc.Util.noCurrentConnection(Util.java:262) at org.apache.derby.impl.jdbc.EmbedConnection.checkIfClosed(EmbedConnection.java:2179) at org.apache.derby.impl.jdbc.EmbedConnection.createStatement(EmbedConnection.java:1334) at org.apache.derby.impl.jdbc.EmbedConnection.createStatement(EmbedConnection.java:1291) at d151.main(d151.java:18) which seems to be our old friend; the classloader being sensitive to the interrupted flag in the thread. With 1.6 I see the disk is full message, though.
          Hide
          Dag H. Wanvik added a comment -

          My observation is really unrelated to the RAFContainer4 situation; if I move the call of Thread.currentThread().interrupt() in d151.java
          to after the first executeUpdate, this doesn't happen, because all relevant classes have been read by the classloader.
          See DERBY-1338 for details.

          So really it's just another manifestation of the fact that interrupting a Derby thread is not really healthy.
          I think we should just catch the ClosedByInterruptException and create a more explanatory error message, e.g maybe thus:

          Exception in thread "main" java.sql.SQLException: Derby thread received an interrupt during a write operation, please check your application.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:142)
          at org.apache.derby.impl.jdbc.Util.seeNextException(Util.java:278)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:398)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:346)
          at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2204)
          at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1323)
          at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1675)
          at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java:302)
          at d151.main(d151.java:18)
          Caused by: java.sql.SQLException: Java exception: ': java.nio.channels.ClosedByInterruptException'.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:142)
          at org.apache.derby.impl.jdbc.Util.javaException(Util.java:299)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:403)
          ... 8 more
          Caused by: java.nio.channels.ClosedByInterruptException
          at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
          at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:656)
          at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(RAFContainer4.java:496)
          at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage0(RAFContainer4.java:346)
          at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage(RAFContainer4.java:291)
          at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:787)
          at org.apache.derby.impl.store.raw.data.CachedPage.createIdentity(CachedPage.java:305)
          at org.apache.derby.impl.services.cache.ConcurrentCache.create(ConcurrentCache.java:388)
          at org.apache.derby.impl.store.raw.data.FileContainer.initPage(FileContainer.java:2336)
          at org.apache.derby.impl.store.raw.data.FileContainer.newPage(FileContainer.java:1800)
          at org.apache.derby.impl.store.raw.data.BaseContainer.addPage(BaseContainer.java:314)
          at org.apache.derby.impl.store.raw.data.BaseContainerHandle.addPage(BaseContainerHandle.java:183)
          at org.apache.derby.impl.store.access.heap.HeapController.doInsert(HeapController.java:302)
          at org.apache.derby.impl.store.access.heap.HeapController.insertAndFetchLocation(HeapController.java:599)
          at org.apache.derby.impl.sql.execute.RowChangerImpl.insertRow(RowChangerImpl.java:452)
          at org.apache.derby.impl.sql.execute.InsertResultSet.normalInsertCore(InsertResultSet.java:1022)
          at org.apache.derby.impl.sql.execute.InsertResultSet.open(InsertResultSet.java:495)
          at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:416)
          at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:297)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1235)

          Show
          Dag H. Wanvik added a comment - My observation is really unrelated to the RAFContainer4 situation; if I move the call of Thread.currentThread().interrupt() in d151.java to after the first executeUpdate, this doesn't happen, because all relevant classes have been read by the classloader. See DERBY-1338 for details. So really it's just another manifestation of the fact that interrupting a Derby thread is not really healthy. I think we should just catch the ClosedByInterruptException and create a more explanatory error message, e.g maybe thus: Exception in thread "main" java.sql.SQLException: Derby thread received an interrupt during a write operation, please check your application. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:142) at org.apache.derby.impl.jdbc.Util.seeNextException(Util.java:278) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:398) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:346) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2204) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1323) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1675) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java:302) at d151.main(d151.java:18) Caused by: java.sql.SQLException: Java exception: ': java.nio.channels.ClosedByInterruptException'. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45) at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:142) at org.apache.derby.impl.jdbc.Util.javaException(Util.java:299) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:403) ... 8 more Caused by: java.nio.channels.ClosedByInterruptException at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184) at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:656) at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(RAFContainer4.java:496) at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage0(RAFContainer4.java:346) at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage(RAFContainer4.java:291) at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:787) at org.apache.derby.impl.store.raw.data.CachedPage.createIdentity(CachedPage.java:305) at org.apache.derby.impl.services.cache.ConcurrentCache.create(ConcurrentCache.java:388) at org.apache.derby.impl.store.raw.data.FileContainer.initPage(FileContainer.java:2336) at org.apache.derby.impl.store.raw.data.FileContainer.newPage(FileContainer.java:1800) at org.apache.derby.impl.store.raw.data.BaseContainer.addPage(BaseContainer.java:314) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.addPage(BaseContainerHandle.java:183) at org.apache.derby.impl.store.access.heap.HeapController.doInsert(HeapController.java:302) at org.apache.derby.impl.store.access.heap.HeapController.insertAndFetchLocation(HeapController.java:599) at org.apache.derby.impl.sql.execute.RowChangerImpl.insertRow(RowChangerImpl.java:452) at org.apache.derby.impl.sql.execute.InsertResultSet.normalInsertCore(InsertResultSet.java:1022) at org.apache.derby.impl.sql.execute.InsertResultSet.open(InsertResultSet.java:495) at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:416) at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:297) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1235)
          Hide
          Dag H. Wanvik added a comment -

          Uploading a patch that catches the ClosedByInterruptException and
          throws a more meaningful error message as suggested above.
          I added a test to the store regression suite and ran it successfully on Windows Vista and OpenSolaris.

          Show
          Dag H. Wanvik added a comment - Uploading a patch that catches the ClosedByInterruptException and throws a more meaningful error message as suggested above. I added a test to the store regression suite and ran it successfully on Windows Vista and OpenSolaris.
          Hide
          Knut Anders Hatlen added a comment -

          The patch looks good to me. Perhaps "please check your application" is a bit vague? What about "please check your application for the source of the interrupt"?

          There may be other I/O calls that are sensitive to interrupts too, but I guess we can address them with a similar fix when they surface.

          Some minor issues in the test:

          • Class name in header is wrong (jdbc4 -> store)
          • getConnection().createStatement() and getConnection().prepareStatement() could be replaced by calls to helper methods in BaseJDBCTestCase
          Show
          Knut Anders Hatlen added a comment - The patch looks good to me. Perhaps "please check your application" is a bit vague? What about "please check your application for the source of the interrupt"? There may be other I/O calls that are sensitive to interrupts too, but I guess we can address them with a similar fix when they surface. Some minor issues in the test: Class name in header is wrong (jdbc4 -> store) getConnection().createStatement() and getConnection().prepareStatement() could be replaced by calls to helper methods in BaseJDBCTestCase
          Hide
          Dag H. Wanvik added a comment - - edited

          Uploading version "b" of this patch.

          • Incorporated Knut's test comments.
          • Changed severity to database; updated ErrorCodeTest to reflect this.
            Rationale: It seems right - I don't know that it's safe to continue
            after this kind of error, and I compared to other IO level errors in
            raw store and mostly seem to have database severity.
          • Made the new Derby151Test not fail even if expected error is not
            seen. I did this to allow for possible different behavior on other
            VMs, JUnit verbose mode would instead print "Not able to test fix
            for DERBY-151: No interrupt seen".
          • I added the same handling to read (even if not seen yet) as for
            write. Regressions passed modulo ErrorCodeTest, which i missed
            first time around.
          • Fixed some broken Javadocs.

          Ready for review.

          Show
          Dag H. Wanvik added a comment - - edited Uploading version "b" of this patch. Incorporated Knut's test comments. Changed severity to database; updated ErrorCodeTest to reflect this. Rationale: It seems right - I don't know that it's safe to continue after this kind of error, and I compared to other IO level errors in raw store and mostly seem to have database severity. Made the new Derby151Test not fail even if expected error is not seen. I did this to allow for possible different behavior on other VMs, JUnit verbose mode would instead print "Not able to test fix for DERBY-151 : No interrupt seen". I added the same handling to read (even if not seen yet) as for write. Regressions passed modulo ErrorCodeTest, which i missed first time around. Fixed some broken Javadocs. Ready for review.
          Hide
          Dag H. Wanvik added a comment -

          Picking this up again; I haven't received any more comments on this, and it seems a net improvement
          to me even if it may not be a panacea. I intend to commit this soon, unless anybody has any misgivings.

          Show
          Dag H. Wanvik added a comment - Picking this up again; I haven't received any more comments on this, and it seems a net improvement to me even if it may not be a panacea. I intend to commit this soon, unless anybody has any misgivings.
          Hide
          Dag H. Wanvik added a comment -

          Hmm, seems there is a problem with patch version b, changing the
          severity to database makes the database try to shut down, which
          doesn't work as long as the interrupt flag is set, I guess, since shutting down attempts more writing operations.

          Show
          Dag H. Wanvik added a comment - Hmm, seems there is a problem with patch version b, changing the severity to database makes the database try to shut down, which doesn't work as long as the interrupt flag is set, I guess, since shutting down attempts more writing operations.
          Hide
          Dag H. Wanvik added a comment - - edited

          Luckily this turned out to be a red herring, the shutting down does see errors since the channel
          has been closed by the interrupt, but the correct database level exception is returned to the app (XSDG9), and
          rebooting+recovery works as expected.

          This will be seen on derby.log when the shutdown sees the closed channel:
          ERROR XSDG9: Derby thread received an interrupt during a disk I/O operation, please check your application for the source of the interrupt.
          :
          ------------ BEGIN SHUTDOWN ERROR STACK -------------

          ERROR XSDG1: Page Page(1,Container(0, 1120)) could not be written to disk, please check if disk is full.
          :
          Caused by: java.nio.channels.ClosedChannelException

          Uploading derby-151-c, which fixes a bug in the test; it asserted for the wrong SQL state. However, assertSQLState is
          a bit broken; if it can't match on the top level SQL state, it goes down the chain to see if it can match a wrapped exception.
          When that fails, it gives up and reports that it found the innermost wrapped exception, whereas the application saw the outermost one,
          causing me to suspect the code fix was wrong.

          Running regressions.

          Show
          Dag H. Wanvik added a comment - - edited Luckily this turned out to be a red herring, the shutting down does see errors since the channel has been closed by the interrupt, but the correct database level exception is returned to the app (XSDG9), and rebooting+recovery works as expected. This will be seen on derby.log when the shutdown sees the closed channel: ERROR XSDG9: Derby thread received an interrupt during a disk I/O operation, please check your application for the source of the interrupt. : ------------ BEGIN SHUTDOWN ERROR STACK ------------- ERROR XSDG1: Page Page(1,Container(0, 1120)) could not be written to disk, please check if disk is full. : Caused by: java.nio.channels.ClosedChannelException Uploading derby-151-c, which fixes a bug in the test; it asserted for the wrong SQL state. However, assertSQLState is a bit broken; if it can't match on the top level SQL state, it goes down the chain to see if it can match a wrapped exception. When that fails, it gives up and reports that it found the innermost wrapped exception, whereas the application saw the outermost one, causing me to suspect the code fix was wrong. Running regressions.
          Hide
          Dag H. Wanvik added a comment -

          Committed patch derby-151-c as svn 886831.

          Show
          Dag H. Wanvik added a comment - Committed patch derby-151-c as svn 886831.
          Hide
          Dag H. Wanvik added a comment -

          Uploading an improved test (derby-151-followup.diff), which also tests using the client driver.

          Show
          Dag H. Wanvik added a comment - Uploading an improved test (derby-151-followup.diff), which also tests using the client driver.
          Hide
          Dag H. Wanvik added a comment -

          Committed derby-151-followup as svn 886963, resolving.

          Show
          Dag H. Wanvik added a comment - Committed derby-151-followup as svn 886963, resolving.

            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Barnet Wagman
            • Votes:
              5 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development