Commons Dbcp
  1. Commons Dbcp
  2. DBCP-65

[dbcp] Deadlock when evicting dbcp objects (testWhileIdle=true)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.2
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: All

      Description

      The GenericKeyedObjectPool$Evictor thread has probability of deadlock with dbcp
      objects.

      For example, at a certain time, a normal user thread calls a PoolingConnection
      object's synchronized method, which in turn calls GenericKeyedObjectPool
      object's synchronzied method.

      At the same time, the evictor thread calls the GenericKeyedObjectPool object's
      synchronized method 'evict', which in turn calls the PoolingConnection object's
      synchronized method. When testWhileIdle=true, the probability of evictor
      calling GenericKeyedObjectPool's synchronized method is much greater.

      The following is the deadlock information in the thread dump of our program
      (testWhileIdle=true):

      "Thread-106":
      at org.apache.commons.dbcp.AbandonedTrace.removeTrace(AbandonedTrace.java:221)

      • waiting to lock <0x6a6b1b80> (a org.apache.commons.dbcp.PoolingConnection)
        at
        org.apache.commons.dbcp.PoolablePreparedStatement.passivate(PoolablePreparedStatement.java:100)
        at
        org.apache.commons.dbcp.PoolingConnection.passivateObject(PoolingConnection.java:239)
        at
        org.apache.commons.pool.impl.GenericKeyedObjectPool.evict(GenericKeyedObjectPool.java:1001)
      • locked <0x6a6b1858> (a org.apache.commons.pool.impl.GenericKeyedObjectPool)
        at
        org.apache.commons.pool.impl.GenericKeyedObjectPool$Evictor.run(GenericKeyedObjectPool.java:1156)
        at java.lang.Thread.run(Thread.java:534)
        "Thread-41":
        at
        org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:715)
      • waiting to lock <0x6a6b1858> (a
        org.apache.commons.pool.impl.GenericKeyedObjectPool)
        at
        org.apache.commons.dbcp.PoolingConnection.prepareStatement(PoolingConnection.java:87)
      • locked <0x6a6b1b80> (a org.apache.commons.dbcp.PoolingConnection)
        at
        org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:185)
        at
        org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:278)
        ...

      This problem can be worked around by setting testWhileIdle to false, although
      there is still very small possibility of deadlock.

        Issue Links

          Activity

          Hide
          Sandy McArthur (from Bugzilla import) added a comment -

          Looking at GenericKeyedObjectPool.Evictor and PoolingConnection I think this is
          a problem with PoolingConnection because it is both a client for
          GenericKeyedObjectPool and passes an instance of itself as the
          KeyedPoolableObjectFactory for GenericKeyedObjectPool.

          I'm not sure GenericKeyedObjectPool can be changed and remain thread safe to fix
          this situation. I think PoolingConnection needs to be changed so it aquires
          synchronization lock in the order of the GenericKeyedObjectPool instance first
          and then itself.

          Does an apache dev agree with me and want to punt this off to Dbcp?

          Show
          Sandy McArthur (from Bugzilla import) added a comment - Looking at GenericKeyedObjectPool.Evictor and PoolingConnection I think this is a problem with PoolingConnection because it is both a client for GenericKeyedObjectPool and passes an instance of itself as the KeyedPoolableObjectFactory for GenericKeyedObjectPool. I'm not sure GenericKeyedObjectPool can be changed and remain thread safe to fix this situation. I think PoolingConnection needs to be changed so it aquires synchronization lock in the order of the GenericKeyedObjectPool instance first and then itself. Does an apache dev agree with me and want to punt this off to Dbcp?
          Hide
          Sandy McArthur (from Bugzilla import) added a comment -

          After sleeping on it I'm convinced this is a DBCP problem.

          DBCP needs to be careful about syncronization on it KeyedPoolableObjectFactory
          implementations. Specificly it needs to pay attention to the fact that the
          ObjectPool's evictor aquires the sync lock for the ObjectPool first and then may
          call the following methods in KeyedPoolableObjectFactory: activateObject,
          validateObject, and passivateObject. I'm not familiar with DBCP else I'd try to
          be more specific.

          Show
          Sandy McArthur (from Bugzilla import) added a comment - After sleeping on it I'm convinced this is a DBCP problem. DBCP needs to be careful about syncronization on it KeyedPoolableObjectFactory implementations. Specificly it needs to pay attention to the fact that the ObjectPool's evictor aquires the sync lock for the ObjectPool first and then may call the following methods in KeyedPoolableObjectFactory: activateObject, validateObject, and passivateObject. I'm not familiar with DBCP else I'd try to be more specific.
          Hide
          Phil Steitz added a comment -

          I cannot find a backward compatible way to fix this.

          Show
          Phil Steitz added a comment - I cannot find a backward compatible way to fix this.
          Hide
          Sandy McArthur added a comment -

          I think the only backward compatible way to fix this is to drop Pool as a dependency and pull GOP into DBCP and make it a DBCP specific implementations. This would be kinda funny since, as I understand it, Pool was birthed from DBCP long ago.

          The other choices would be to:

          1. Extend PoolableObjectFactory with a subinterface that provides hooks for the pool's eviction thread to acquire the needed locks before acquiring the internal pool locks. I think this would be very hard to do in the existing pool implementations in a backwards compatible way, but there is the new composite pool implementation that hasn't seen an official release and we could make this work with that in Pool 2.

          2. Simply make the DBCP config options that cause an eviction thread to be created be no-ops and provide a way to enable the erodingPool decorator that will be in Pool 2.

          3. Perform deep surgery on DBCP so it works with existing pool implementations but, as you said, this probably won't be backward compatible and I think it's a bit of the tail wagging the dog.

          I personally think #2 is the best choice but I'll work with you to make #1 happen. A lot of users think they need an evictor even though I maintain that is a risky way to use Pool and should be discouraged.

          Show
          Sandy McArthur added a comment - I think the only backward compatible way to fix this is to drop Pool as a dependency and pull GOP into DBCP and make it a DBCP specific implementations. This would be kinda funny since, as I understand it, Pool was birthed from DBCP long ago. The other choices would be to: 1. Extend PoolableObjectFactory with a subinterface that provides hooks for the pool's eviction thread to acquire the needed locks before acquiring the internal pool locks. I think this would be very hard to do in the existing pool implementations in a backwards compatible way, but there is the new composite pool implementation that hasn't seen an official release and we could make this work with that in Pool 2. 2. Simply make the DBCP config options that cause an eviction thread to be created be no-ops and provide a way to enable the erodingPool decorator that will be in Pool 2. 3. Perform deep surgery on DBCP so it works with existing pool implementations but, as you said, this probably won't be backward compatible and I think it's a bit of the tail wagging the dog. I personally think #2 is the best choice but I'll work with you to make #1 happen. A lot of users think they need an evictor even though I maintain that is a risky way to use Pool and should be discouraged.
          Hide
          Phil Steitz added a comment -

          I am +1 on option 2, or just migrate to Pool 2 when it is released. Having spent many hours trying to find workarounds for this and DBCP-44 (trying to do things like 1 outside of [pool], which is hopeless), I agree that the current evictor setup is a recipe for deadlocks and can't be fixed without changes to [pool] or [dbcp] config semantics.

          Show
          Phil Steitz added a comment - I am +1 on option 2, or just migrate to Pool 2 when it is released. Having spent many hours trying to find workarounds for this and DBCP-44 (trying to do things like 1 outside of [pool] , which is hopeless), I agree that the current evictor setup is a recipe for deadlocks and can't be fixed without changes to [pool] or [dbcp] config semantics.
          Hide
          Phil Steitz added a comment -

          Unless someone can give a good reason that PoolingConnection's prepareStatement methods must be synchronized, I am inclined to remove the synchronization from these methods. That sounds radical, but observe that all we have in these methods is:q
          try

          { return(PreparedStatement)(_pstmtPool.borrowObject(createKey(sql))); }

          catch(NoSuchElementException e)

          { throw new SQLNestedException("MaxOpenPreparedStatements limit reached", e); }

          catch(RuntimeException e)

          { throw e; }

          catch(Exception e)

          { throw new SQLNestedException("Borrow prepareStatement from pool failed", e); }

          The borrowObject pool method is synchronized and the cause of the deadlocks here and in DBCP-202 is that these methods lock the PoolingConnection first and then the pool. I thought first about changing the order explicitly by replacing the outer synchronization with
          synchronized (_pstmtPool) {
          synchronized (this) {
          ...
          to force consistent lock order, but then could not see any reason for the inner synch, and the other one is already there in the pool method.
          Can anyone see why these methods need to be synchronized on the PoolingConnection? If this seems too high-risk, how about the explicit lock ordering above?
          Removing or altering synchronization lock order should resolve DBCP-202 as well.

          Show
          Phil Steitz added a comment - Unless someone can give a good reason that PoolingConnection's prepareStatement methods must be synchronized, I am inclined to remove the synchronization from these methods. That sounds radical, but observe that all we have in these methods is:q try { return(PreparedStatement)(_pstmtPool.borrowObject(createKey(sql))); } catch(NoSuchElementException e) { throw new SQLNestedException("MaxOpenPreparedStatements limit reached", e); } catch(RuntimeException e) { throw e; } catch(Exception e) { throw new SQLNestedException("Borrow prepareStatement from pool failed", e); } The borrowObject pool method is synchronized and the cause of the deadlocks here and in DBCP-202 is that these methods lock the PoolingConnection first and then the pool. I thought first about changing the order explicitly by replacing the outer synchronization with synchronized (_pstmtPool) { synchronized (this) { ... to force consistent lock order, but then could not see any reason for the inner synch, and the other one is already there in the pool method. Can anyone see why these methods need to be synchronized on the PoolingConnection? If this seems too high-risk, how about the explicit lock ordering above? Removing or altering synchronization lock order should resolve DBCP-202 as well.
          Hide
          Phil Steitz added a comment -

          Synchronization was removed from the perpareStatement methods per last comment in r498524.

          Show
          Phil Steitz added a comment - Synchronization was removed from the perpareStatement methods per last comment in r498524.
          Hide
          David Rees added a comment -

          This bug still exists in 1.2.2. I don't think that removing the prepareStatement synchronization helps, there is still a lock-race between the Evictor and returning objects to the pool for example.

          Here is a stack trace, I've had this happen twice in the last week on two completely different applications (after using dbcp successfully without issue for years!):

          "Timer-0":
          at org.apache.commons.dbcp.AbandonedTrace.addTrace(AbandonedTrace.java:175)
          waiting to lock <0x0000002aa04d4be8> (a org.apache.commons.dbcp.PoolableConnection)
          at org.apache.commons.dbcp.AbandonedTrace.init(AbandonedTrace.java:92)
          at org.apache.commons.dbcp.AbandonedTrace.<init>(AbandonedTrace.java:82)
          at org.apache.commons.dbcp.DelegatingStatement.<init>(DelegatingStatement.java:61)
          at org.apache.commons.dbcp.DelegatingConnection.createStatement(DelegatingConnection.java:224)
          at org.apache.commons.dbcp.PoolableConnectionFactory.validateConnection(PoolableConnectionFactory.java:331)
          at org.apache.commons.dbcp.PoolableConnectionFactory.validateObject(PoolableConnectionFactory.java:312)
          at org.apache.commons.pool.impl.GenericObjectPool.evict(GenericObjectPool.java:1217)
          locked <0x0000002a9ee91bf8> (a org.apache.commons.pool.impl.GenericObjectPool)
          at org.apache.commons.pool.impl.GenericObjectPool$Evictor.run(GenericObjectPool.java:1341)
          at java.util.TimerThread.mainLoop(Timer.java:512)
          at java.util.TimerThread.run(Timer.java:462)
          "SocketRequest-8":
          at org.apache.commons.pool.impl.GenericObjectPool.addObjectToPool(GenericObjectPool.java:1137)
          waiting to lock <0x0000002a9ee91bf8> (a org.apache.commons.pool.impl.GenericObjectPool)
          at org.apache.commons.pool.impl.GenericObjectPool.returnObject(GenericObjectPool.java:1076)
          at org.apache.commons.dbcp.PoolableConnection.close(PoolableConnection.java:87)
          locked <0x0000002aa04d4be8> (a org.apache.commons.dbcp.PoolableConnection)
          at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.close(PoolingDataSource.java:181)
          at my.package.DbUtil.close(DbUtil.java:545)

          In this configuration, minIdle=0, maxIdle=4, maxActive=8, testWhileIdle=true timeBetweenEvictionRunsMillis=15000

          It appears that the only workaround for now to prevent this deadlock is to disable the Evictor thread by disabling testWhileIdle.

          I have two comments on DBCP-44 with other information (before I realized that this was the bug I was hitting, though they are related).

          Show
          David Rees added a comment - This bug still exists in 1.2.2. I don't think that removing the prepareStatement synchronization helps, there is still a lock-race between the Evictor and returning objects to the pool for example. Here is a stack trace, I've had this happen twice in the last week on two completely different applications (after using dbcp successfully without issue for years!): "Timer-0": at org.apache.commons.dbcp.AbandonedTrace.addTrace(AbandonedTrace.java:175) waiting to lock <0x0000002aa04d4be8> (a org.apache.commons.dbcp.PoolableConnection) at org.apache.commons.dbcp.AbandonedTrace.init(AbandonedTrace.java:92) at org.apache.commons.dbcp.AbandonedTrace.<init>(AbandonedTrace.java:82) at org.apache.commons.dbcp.DelegatingStatement.<init>(DelegatingStatement.java:61) at org.apache.commons.dbcp.DelegatingConnection.createStatement(DelegatingConnection.java:224) at org.apache.commons.dbcp.PoolableConnectionFactory.validateConnection(PoolableConnectionFactory.java:331) at org.apache.commons.dbcp.PoolableConnectionFactory.validateObject(PoolableConnectionFactory.java:312) at org.apache.commons.pool.impl.GenericObjectPool.evict(GenericObjectPool.java:1217) locked <0x0000002a9ee91bf8> (a org.apache.commons.pool.impl.GenericObjectPool) at org.apache.commons.pool.impl.GenericObjectPool$Evictor.run(GenericObjectPool.java:1341) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462) "SocketRequest-8": at org.apache.commons.pool.impl.GenericObjectPool.addObjectToPool(GenericObjectPool.java:1137) waiting to lock <0x0000002a9ee91bf8> (a org.apache.commons.pool.impl.GenericObjectPool) at org.apache.commons.pool.impl.GenericObjectPool.returnObject(GenericObjectPool.java:1076) at org.apache.commons.dbcp.PoolableConnection.close(PoolableConnection.java:87) locked <0x0000002aa04d4be8> (a org.apache.commons.dbcp.PoolableConnection) at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.close(PoolingDataSource.java:181) at my.package.DbUtil.close(DbUtil.java:545) In this configuration, minIdle=0, maxIdle=4, maxActive=8, testWhileIdle=true timeBetweenEvictionRunsMillis=15000 It appears that the only workaround for now to prevent this deadlock is to disable the Evictor thread by disabling testWhileIdle. I have two comments on DBCP-44 with other information (before I realized that this was the bug I was hitting, though they are related).
          Hide
          Jan Novotný added a comment -

          I confirmt, that deadlock persists even in 1.2.2 version. Testing web application under a high load it will with 100% probability stuck at return object method:

          at org.apache.commons.pool.impl.GenericKeyedObjectPool.returnObject (GenericKeyedObjectPool.java:872)
          at org.apache.commons.dbcp.datasources.KeyedCPDSConnectionFactory.connectionClosed (KeyedCPDSConnectionFactory.java:268)
          at com.mysql.jdbc.jdbc2.optional.MysqlPooledConnection.callListener (MysqlPooledConnection.java:209)
          at com.mysql.jdbc.jdbc2.optional.ConnectionWrapper.close (ConnectionWrapper.java:857)
          at com.mysql.jdbc.jdbc2.optional.ConnectionWrapper.close (ConnectionWrapper.java:480)

          Show
          Jan Novotný added a comment - I confirmt, that deadlock persists even in 1.2.2 version. Testing web application under a high load it will with 100% probability stuck at return object method: at org.apache.commons.pool.impl.GenericKeyedObjectPool.returnObject (GenericKeyedObjectPool.java:872) at org.apache.commons.dbcp.datasources.KeyedCPDSConnectionFactory.connectionClosed (KeyedCPDSConnectionFactory.java:268) at com.mysql.jdbc.jdbc2.optional.MysqlPooledConnection.callListener (MysqlPooledConnection.java:209) at com.mysql.jdbc.jdbc2.optional.ConnectionWrapper.close (ConnectionWrapper.java:857) at com.mysql.jdbc.jdbc2.optional.ConnectionWrapper.close (ConnectionWrapper.java:480)

            People

            • Assignee:
              Unassigned
              Reporter:
              Wang Xianzhu
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development