Uploaded image for project: 'ActiveMQ Classic'
  1. ActiveMQ Classic
  2. AMQ-5961

Deadlock in client blocks all application server threads

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 5.11.0, 5.11.1
    • 5.13.0
    • JMS client, RAR
    • None
    • Platform independent

    • Patch Available

    Description

      When we run JBoss EAP 6.4 with high load using AMQ 6.2 (ActiveMQ 5.11) everything grinds to a halt after a few hours. We have 170 threads blocked on the same lock:

      "default-threads - 1400" #406720 prio=5 os_prio=0 tid=0x00007f1b8402b800 nid=0xfe2d waiting for monitor entry [0x00007f19ccdd5000]
      java.lang.Thread.State: BLOCKED (on object monitor)
      at org.apache.activemq.TransactionContext.setXid(TransactionContext.java:729)

      • waiting to lock <0x000000063ac688b8> (a java.util.HashMap)
        at org.apache.activemq.TransactionContext.end(TransactionContext.java:418)
        at org.apache.activemq.ra.LocalAndXATransaction.end(LocalAndXATransaction.java:98)
        at org.jboss.jca.core.tx.jbossts.XAResourceWrapperImpl.end(XAResourceWrapperImpl.java:118)
        at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelPrepare(XAResourceRecord.java:208)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.doPrepare(BasicAction.java:2530)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.doPrepare(BasicAction.java:2497)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.prepare(BasicAction.java:2074)
      • locked <0x000000067461c090> (a com.arjuna.ats.internal.jta.transaction.arjunacore.AtomicAction)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.End(BasicAction.java:1485)
      • locked <0x000000067461c090> (a com.arjuna.ats.internal.jta.transaction.arjunacore.AtomicAction)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:98)
        at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)

      This lock is owned by a thread that is waiting for another lock:

      "default-threads - 1381" #404073 prio=5 os_prio=0 tid=0x00007f1a6403c000 nid=0x7d49 waiting on condition [0x00007f19d8a91000]
      java.lang.Thread.State: WAITING (parking)
      at sun.misc.Unsafe.park(Native Method)

      • parking to wait for <0x000000063ab2a260> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
        at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:66)
        at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
        at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1309)
        at org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1303)
        at org.apache.activemq.ActiveMQSession.asyncSendPacket(ActiveMQSession.java:2008)
        at org.apache.activemq.ActiveMQSession$5.afterRollback(ActiveMQSession.java:973)
        at org.apache.activemq.TransactionContext.afterRollback(TransactionContext.java:162)
        at org.apache.activemq.TransactionContext.prepare(TransactionContext.java:482)
      • locked <0x000000063ac688b8> (a java.util.HashMap)
        at org.apache.activemq.ra.LocalAndXATransaction.prepare(LocalAndXATransaction.java:130)
        at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelPrepare(XAResourceRecord.java:213)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.doPrepare(BasicAction.java:2530)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.doPrepare(BasicAction.java:2497)

      The locked object is ENDED_XA_TRANSACTION_CONTEXTS in TransactionContext. The thread that has the lock calls afterRollback without first releasing it and afterRollback does not return promptly (not at all in fact, we need to restart the server to fix this). The same can happen with afterCommit, which is also called from within a synchronized block causing a potential deadlock.

      I think I have a fix for this and will create a pull request in github, or if you prefer I can send a patch? It simply means moving the external calls out from the synchronized blocks and would both remove the very real deadlock risks and improve performance.

      Attachments

        1. edited-trace.txt
          1.27 MB
          Erik Wramner

        Activity

          People

            Unassigned Unassigned
            erik.wramner Erik Wramner
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: