Derby
  1. Derby
  2. DERBY-4129

jvm level Deadlock between 2 threads accessing the same connection

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 10.4.2.0
    • Fix Version/s: None
    • Component/s: JDBC, SQL, Store
    • Environment:
      Linux 2.6.18-53.1.14.el5 #1 SMP Wed Mar 5 11:36:49 EST 2008 i686 i686 i386 GNU/Linux
    • Urgency:
      Normal
    • Bug behavior facts:
      Crash, Seen in production

      Description

      Threads working with embedded derby get deadlocked.
      Thread dump:

      Found one Java-level deadlock:
      =============================
      "http-8080-13":
      waiting to lock monitor 0x09d1a37c (object 0x9560fcd8, a org.apache.derby.impl.store.raw.data.BaseContainerHandle),
      which is held by "dwc.ma.MAJobsManager:dwc.ma.MAManager"
      "dwc.ma.MAJobsManager:dwc.ma.MAManager":
      waiting to lock monitor 0x09d1a57c (object 0x93da0180, a org.apache.derby.impl.store.raw.data.StoredPage),
      which is held by "http-8080-13"

      Java stack information for the threads listed above:
      ===================================================
      "http-8080-13":
      at java.util.Observable.deleteObserver(Observable.java:78)

      • waiting to lock <0x9560fcd8> (a org.apache.derby.impl.store.raw.data.BaseContainerHandle)
        at org.apache.derby.impl.store.raw.data.BasePage.releaseExclusive(Unknown Source)
      • locked <0x93da0180> (a org.apache.derby.impl.store.raw.data.StoredPage)
        at org.apache.derby.impl.store.raw.data.CachedPage.releaseExclusive(Unknown Source)
        at org.apache.derby.impl.store.raw.data.StoredPage.releaseExclusive(Unknown Source)
        at org.apache.derby.impl.store.raw.data.BasePage.unlatch(Unknown Source)
        at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(Unknown Source)
        at org.apache.derby.impl.store.raw.data.BufferedByteHolderInputStream.read(Unknown Source)
        at java.io.DataInputStream.readFully(DataInputStream.java:176)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at org.apache.derby.iapi.types.SQLBinary.readExternal(Unknown Source)
        at org.apache.derby.iapi.types.SQLBinary.getValue(Unknown Source)
        at org.apache.derby.iapi.types.SQLBinary.getBytes(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedResultSet.getBytes(Unknown Source)
        at dwc.ma.MAOrderProcessor.loadData(MAOrderProcessor.java:147)
      • locked <0x91f211b0> (a java.lang.Object)
      • locked <0x91f6a068> (a dwc.ma.MAOrderProcessor)
        at dwc.ma.MAOrderProcessor.generateBundlesForUser(MAOrderProcessor.java:538)
      • locked <0x91f6a068> (a dwc.ma.MAOrderProcessor)
        at dwc.ma.MARequestProcessor$HBundleRequest.process(MARequestProcessor.java:239)
        at dwc.ma.MARequestProcessor.processRequest(MARequestProcessor.java:39)
        at dwc.ma.doProc.doPost(doProc.java:92)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.tuckey.web.filters.urlrewrite.UrlRewriteFilter.doFilter(UrlRewriteFilter.java:738)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:595)
        "dwc.ma.MAJobsManager:dwc.ma.MAManager":
        at org.apache.derby.impl.store.raw.data.BasePage.releaseExclusive(Unknown Source)
      • waiting to lock <0x93da0180> (a org.apache.derby.impl.store.raw.data.StoredPage)
        at org.apache.derby.impl.store.raw.data.CachedPage.releaseExclusive(Unknown Source)
        at org.apache.derby.impl.store.raw.data.StoredPage.releaseExclusive(Unknown Source)
        at org.apache.derby.impl.store.raw.data.BasePage.update(Unknown Source)
        at java.util.Observable.notifyObservers(Observable.java:142)
        at java.util.Observable.notifyObservers(Observable.java:98)
        at org.apache.derby.impl.store.raw.data.BaseContainerHandle.informObservers(Unknown Source)
        at org.apache.derby.impl.store.raw.data.BaseContainerHandle.close(Unknown Source)
      • locked <0x9560fcd8> (a org.apache.derby.impl.store.raw.data.BaseContainerHandle)
        at org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.close(Unknown Source)
        at org.apache.derby.impl.store.access.conglomerate.GenericController.close(Unknown Source)
        at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.closeForEndTransaction(Unknown Source)
        at org.apache.derby.impl.store.access.RAMTransaction.closeControllers(Unknown Source)
        at org.apache.derby.impl.store.access.RAMTransaction.commit(Unknown Source)
        at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.doCommit(Unknown Source)
        at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.userCommit(Unknown Source)
        at org.apache.derby.impl.jdbc.TransactionResourceImpl.commit(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.commitIfAutoCommit(Unknown Source)
        at org.apache.derby.impl.jdbc.ConnectionChild.commitIfAutoCommit(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedStatement.resultSetClosing(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedResultSet.close(Unknown Source)
      • locked <0x91f0b7d0> (a org.apache.derby.impl.jdbc.EmbedConnection30)
        at dwc.ma.MAStatsProcessor$UserStatsBundlesProc.run(MAStatsProcessor.java:117)
        at dwc.ma.MAJobsManager$Runner.run(MAJobsManager.java:84)
      • locked <0x91fce338> (a java.util.LinkedList)
        at java.lang.Thread.run(Thread.java:595)

      Found 1 deadlock.

      1. threaddump.txt
        11 kB
        Knut Anders Hatlen
      2. BlobDeadlock.java
        2 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          I am able to reproduce this deadlock. See the attached files BlobDeadlock.java and threaddump.txt.

          Note that the repro has two threads performing queries using the same Connection object. As far as I can see, that's the only way two threads can end up accessing the same BaseContainerHandle, and thereby running into this deadlock. Having concurrent threads using the same Connection object is not recommended, but it should of course not cause a Java-level deadlock.

          Normally, deadlock between threads sharing the same Connection is prevented by only allowing one thread at a time to enter the engine through a single Connection. For instance, EmbedResultSet.next() and EmbedResultSet.close() will synchronize on the parent EmbedConnection object before calling the engine code. The getters don't synchronize on the EmbedConnection because they just return a value that was cached in the ResultSet when next() was called.

          The reason why this doesn't work as intended in this case, is that EmbedResultSet.getBytes() will have to go to the store to fetch the value if the BLOB column is larger than what EmbedResultSet.next() will cache (basically, it doesn't fit on one page). But since the getters don't synchronize on the EmbedConnection, we now enter the engine without making sure that no other thread is actively using the same connection. If getBytes() had synchronized on the connection, we would have been stopped from entering the engine much earlier because the other thread involved in the deadlock enters the engine from close(), where it does synchronize on the connection, and we wouldn't run into the deadlock.

          Show
          Knut Anders Hatlen added a comment - I am able to reproduce this deadlock. See the attached files BlobDeadlock.java and threaddump.txt. Note that the repro has two threads performing queries using the same Connection object. As far as I can see, that's the only way two threads can end up accessing the same BaseContainerHandle, and thereby running into this deadlock. Having concurrent threads using the same Connection object is not recommended, but it should of course not cause a Java-level deadlock. Normally, deadlock between threads sharing the same Connection is prevented by only allowing one thread at a time to enter the engine through a single Connection. For instance, EmbedResultSet.next() and EmbedResultSet.close() will synchronize on the parent EmbedConnection object before calling the engine code. The getters don't synchronize on the EmbedConnection because they just return a value that was cached in the ResultSet when next() was called. The reason why this doesn't work as intended in this case, is that EmbedResultSet.getBytes() will have to go to the store to fetch the value if the BLOB column is larger than what EmbedResultSet.next() will cache (basically, it doesn't fit on one page). But since the getters don't synchronize on the EmbedConnection, we now enter the engine without making sure that no other thread is actively using the same connection. If getBytes() had synchronized on the connection, we would have been stopped from entering the engine much earlier because the other thread involved in the deadlock enters the engine from close(), where it does synchronize on the connection, and we wouldn't run into the deadlock.
          Hide
          Kathey Marsden added a comment -

          It it supported to use a single connection simultaneously in multiple threads?

          Show
          Kathey Marsden added a comment - It it supported to use a single connection simultaneously in multiple threads?
          Hide
          Knut Anders Hatlen added a comment -

          We do have a lot of synchronization machinery to make it safe to access a connection simultaneously from different threads, so in that sense it's supported. But I would certainly not recommend it.

          Show
          Knut Anders Hatlen added a comment - We do have a lot of synchronization machinery to make it safe to access a connection simultaneously from different threads, so in that sense it's supported. But I would certainly not recommend it.
          Hide
          Rick Hillegas added a comment -

          Triaged July 2, 2009: I don't know whether this is really a crash. I don't know how to interpret the output of the attached test case. I am not seeing any stack traces or errors so I can't evaluate whether this deadlock crashes the application.

          Show
          Rick Hillegas added a comment - Triaged July 2, 2009: I don't know whether this is really a crash. I don't know how to interpret the output of the attached test case. I am not seeing any stack traces or errors so I can't evaluate whether this deadlock crashes the application.
          Hide
          Dag H. Wanvik added a comment - - edited

          Saw similar code (acessing store without synchronizing on rootConnection as most JDBC API methods do to be thread safe): cf. this stack taken from RAFCOntainer4 (while investigating DERBY-4741):

          Output trace from RAFContainer4:

          lcc=org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext@1e2befa

          java.lang.Exception: Stack trace
          at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(RAFContainer4.java:494)
          at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(RAFContainer4.java:244)
          at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:214)
          at org.apache.derby.impl.store.raw.data.CachedPage.readPage(CachedPage.java:671)
          at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:190)
          at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295)
          at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2452)
          at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2502)
          at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)
          at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(OverflowInputStream.java:150)
          at org.apache.derby.impl.store.raw.data.BufferedByteHolderInputStream.read(BufferedByteHolderInputStream.java:37)
          at java.io.FilterInputStream.read(FilterInputStream.java:66)
          at org.apache.derby.impl.jdbc.PositionedStoreStream.read(PositionedStoreStream.java:131)
          at org.apache.derby.impl.jdbc.UTF8Reader.fillBuffer(UTF8Reader.java:415)
          at org.apache.derby.impl.jdbc.UTF8Reader.skip(UTF8Reader.java:232)
          at org.apache.derby.impl.jdbc.StoreStreamClob.getReader(StoreStreamClob.java:236)
          at org.apache.derby.impl.jdbc.ClobUpdatableReader.updateReaderIfRequired(ClobUpdatableReader.java:210)
          at org.apache.derby.impl.jdbc.ClobUpdatableReader.read(ClobUpdatableReader.java:120)
          at InterruptTestClob$WorkerThread.run(InterruptTestClob.java:171)

          I don't believe this stack holds the monitor on rootConnection.... since none of the JDBC methods seem grab it. I note that UTF8Reader#fillBuffer does call EmbedConnection#setupContextStack, cf the fact that lcc is set when retrieved via getContext from RAFContainer4.

          Additionally, the Clob code doesn't ever call ConnectionChild#handleException to clean up errors (close connection or shut down database) - EmbedClob does, at least in some cases. In the above case, this meant that an error thrown from RAFContainer4 (FILE_IO_INTERRUPTED, database level), seen wrapped as an IOException by the Clob#read call, did not lead to Derby shutting down until the next operation against the Connection was initiated and hit a problem.. Is this the way it is intended to be? I think the CLOB code should both synchronize and handle errors as the rest of the JDBC code does?

          Show
          Dag H. Wanvik added a comment - - edited Saw similar code (acessing store without synchronizing on rootConnection as most JDBC API methods do to be thread safe): cf. this stack taken from RAFCOntainer4 (while investigating DERBY-4741 ): Output trace from RAFContainer4: lcc=org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext@1e2befa java.lang.Exception: Stack trace at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(RAFContainer4.java:494) at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(RAFContainer4.java:244) at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:214) at org.apache.derby.impl.store.raw.data.CachedPage.readPage(CachedPage.java:671) at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:190) at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295) at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2452) at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2502) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319) at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(OverflowInputStream.java:150) at org.apache.derby.impl.store.raw.data.BufferedByteHolderInputStream.read(BufferedByteHolderInputStream.java:37) at java.io.FilterInputStream.read(FilterInputStream.java:66) at org.apache.derby.impl.jdbc.PositionedStoreStream.read(PositionedStoreStream.java:131) at org.apache.derby.impl.jdbc.UTF8Reader.fillBuffer(UTF8Reader.java:415) at org.apache.derby.impl.jdbc.UTF8Reader.skip(UTF8Reader.java:232) at org.apache.derby.impl.jdbc.StoreStreamClob.getReader(StoreStreamClob.java:236) at org.apache.derby.impl.jdbc.ClobUpdatableReader.updateReaderIfRequired(ClobUpdatableReader.java:210) at org.apache.derby.impl.jdbc.ClobUpdatableReader.read(ClobUpdatableReader.java:120) at InterruptTestClob$WorkerThread.run(InterruptTestClob.java:171) I don't believe this stack holds the monitor on rootConnection.... since none of the JDBC methods seem grab it. I note that UTF8Reader#fillBuffer does call EmbedConnection#setupContextStack, cf the fact that lcc is set when retrieved via getContext from RAFContainer4. Additionally, the Clob code doesn't ever call ConnectionChild#handleException to clean up errors (close connection or shut down database) - EmbedClob does, at least in some cases. In the above case, this meant that an error thrown from RAFContainer4 (FILE_IO_INTERRUPTED, database level), seen wrapped as an IOException by the Clob#read call, did not lead to Derby shutting down until the next operation against the Connection was initiated and hit a problem.. Is this the way it is intended to be? I think the CLOB code should both synchronize and handle errors as the rest of the JDBC code does?
          Hide
          Kristian Waagan added a comment -

          Dag, the following JIRA is probably relevant: DERBY-2762

          Show
          Kristian Waagan added a comment - Dag, the following JIRA is probably relevant: DERBY-2762
          Hide
          Dag H. Wanvik added a comment -

          Thanks, Kristian. Linked to DERBY-2762.

          Show
          Dag H. Wanvik added a comment - Thanks, Kristian. Linked to DERBY-2762 .
          Hide
          Mike Matrigali added a comment -

          derby 10.9 triage. no changes.

          Show
          Mike Matrigali added a comment - derby 10.9 triage. no changes.

            People

            • Assignee:
              Unassigned
              Reporter:
              Malchenko Inna
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:

                Development