ActiveMQ Classic / AMQ-4274

Potential deadlock between FailoverTransport and AbstractInactivityMonitor

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.7.0
    • Fix Version/s: 5.8.0
    • Component/s: Transport
    • Labels: None

    Description

      It's possible for an operation that is sending via oneway in FailoverTransport to deadlock with a keep-alive write in the inactivity monitor. The thread dump below shows the cycle:

      Found one Java-level deadlock:
      =============================
      "U.GeoCodingIncBuilder.1":
        waiting for ownable synchronizer 0x00002aaac04e29e8, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
        which is held by "ActiveMQ Session Task-42904"
      "ActiveMQ Session Task-42904":
        waiting for ownable synchronizer 0x00002ab3797e7348, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
        which is held by "ActiveMQ InactivityMonitor Worker"
      "ActiveMQ InactivityMonitor Worker":
        waiting to lock monitor 0x00002ab729f36a70 (object 0x00002aaac04f11d8, a java.lang.Object),
        which is held by "ActiveMQ Session Task-42904"
      
      Java stack information for the threads listed above:
      ===================================================
      "U.GeoCodingIncBuilder.1":
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00002aaac04e29e8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
      	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
      	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
      	at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:66)
      	at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
      	at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1290)
      	at org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1284)
      	at org.apache.activemq.ActiveMQSession.<init>(ActiveMQSession.java:252)
      	at org.apache.activemq.ActiveMQConnection.createSession(ActiveMQConnection.java:332)
      	at linqmap.ipc.impl.jms.SingleJmsFactory.createSession(SingleJmsFactory.java:492)
      	at linqmap.ipc.impl.jms.SingleJmsFactory.createSendSession(SingleJmsFactory.java:318)
      	at linqmap.ipc.impl.jms.JmsQueue.send(JmsQueue.java:117)
      	at linqmap.ipc.queues.wrappers.AsyncSender$2.run(AsyncSender.java:81)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
      	at java.lang.Thread.run(Thread.java:619)
      "ActiveMQ Session Task-42904":
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00002ab3797e7348> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:877)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1197)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:594)
      	at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:268)
      	at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
      	at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
      	at org.apache.activemq.transport.failover.FailoverTransport.oneway(FailoverTransport.java:640)
      	- locked <0x00002aaac04f11d8> (a java.lang.Object)
      	at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
      	at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
      	at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1290)
      	at org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1284)
      	at org.apache.activemq.ActiveMQSession.asyncSendPacket(ActiveMQSession.java:1898)
      	at org.apache.activemq.ActiveMQSession.sendAck(ActiveMQSession.java:2064)
      	at org.apache.activemq.ActiveMQSession.sendAck(ActiveMQSession.java:2059)
      	at org.apache.activemq.ActiveMQMessageConsumer.acknowledge(ActiveMQMessageConsumer.java:1061)
      	- locked <0x00002aaac06c7280> (a java.util.LinkedList)
      	at org.apache.activemq.ActiveMQSession.acknowledge(ActiveMQSession.java:1604)
      	at org.apache.activemq.ActiveMQMessageConsumer$1.execute(ActiveMQMessageConsumer.java:552)
      	at org.apache.activemq.command.ActiveMQMessage.acknowledge(ActiveMQMessage.java:97)
      	at linqmap.ipc.impl.jms.JmsQueue.onMessage(JmsQueue.java:262)
      	at org.apache.activemq.ActiveMQMessageConsumer.dispatch(ActiveMQMessageConsumer.java:1321)
      	- locked <0x00002aaac06c0b70> (a java.lang.Object)
      	at org.apache.activemq.ActiveMQSessionExecutor.dispatch(ActiveMQSessionExecutor.java:131)
      	at org.apache.activemq.ActiveMQSessionExecutor.iterate(ActiveMQSessionExecutor.java:202)
      	at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
      	at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
      	at java.lang.Thread.run(Thread.java:619)
      "ActiveMQ InactivityMonitor Worker":
      	at org.apache.activemq.transport.failover.FailoverTransport.handleTransportFailure(FailoverTransport.java:252)
      	- waiting to lock <0x00002aaac04f11d8> (a java.lang.Object)
      	at org.apache.activemq.transport.failover.FailoverTransport$3.onException(FailoverTransport.java:209)
      	at org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
      	at org.apache.activemq.transport.WireFormatNegotiator.onException(WireFormatNegotiator.java:160)
      	at org.apache.activemq.transport.AbstractInactivityMonitor.onException(AbstractInactivityMonitor.java:295)
      	at org.apache.activemq.transport.AbstractInactivityMonitor$3.run(AbstractInactivityMonitor.java:168)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
      	at java.lang.Thread.run(Thread.java:619)
      
      Found 1 deadlock.
      
      

      The deadlock occurs when the write check task detects a failure and calls the onException method while still holding the write side of the monitor's read/write lock. FailoverTransport holds its reconnect mutex for the duration of the oneway call, and its onException handler (handleTransportFailure in the trace above) tries to lock that same mutex. If the oneway call is waiting on the read side of the monitor's read/write lock at that moment, neither thread can proceed.
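
      The lock ordering described above can be reproduced in isolation. The following is a minimal sketch, not ActiveMQ code: the class LockCycleSketch and the names reconnectMutex and monitorLock are hypothetical, a ReentrantLock stands in for the Object monitor that FailoverTransport synchronizes on, and timed tryLock calls replace the blocking acquisitions so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the lock cycle; names are assumptions, not ActiveMQ source.
class LockCycleSketch {

    // Returns {senderGotReadLock, workerGotReconnectMutex}.
    static boolean[] run() throws InterruptedException {
        ReentrantLock reconnectMutex = new ReentrantLock();               // stand-in for the failover reconnect mutex
        ReentrantReadWriteLock monitorLock = new ReentrantReadWriteLock(); // stand-in for the monitor's R/W lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];

        // Plays the session task: takes the reconnect mutex, then wants the read lock.
        Thread sender = new Thread(() -> {
            reconnectMutex.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                acquired[0] = monitorLock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
                if (acquired[0]) {
                    monitorLock.readLock().unlock();
                }
                bothTried.countDown();
                bothTried.await(); // keep holding the first lock until both attempts are done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                reconnectMutex.unlock();
            }
        });

        // Plays the monitor worker: takes the write lock, then wants the reconnect mutex.
        Thread worker = new Thread(() -> {
            monitorLock.writeLock().lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                acquired[1] = reconnectMutex.tryLock(200, TimeUnit.MILLISECONDS);
                if (acquired[1]) {
                    reconnectMutex.unlock();
                }
                bothTried.countDown();
                bothTried.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                monitorLock.writeLock().unlock();
            }
        });

        sender.start();
        worker.start();
        sender.join();
        worker.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] acquired = run();
        System.out.println("sender acquired read lock: " + acquired[0]);
        System.out.println("worker acquired reconnect mutex: " + acquired[1]);
    }
}
```

      With blocking lock() calls in place of the timed attempts, both threads would wait indefinitely, which matches the cycle reported in the thread dump.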

      The solution is to ensure that we always release the write lock before calling the next transport's onException method, but only after setting the failed flag, so that any waiting oneway calls fail with an IOException indicating that the transport has already failed. This frees the failover transport to do its normal failure recovery processing.
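
      The ordering in the fix (set the failed flag, release the write lock, then notify) might look roughly like the sketch below. It is an illustration under assumptions rather than the committed patch: InactivityMonitorSketch, FailureListener, sendLock and the simulateWriteFailure parameter are all hypothetical names introduced here, and the actual keep-alive write is omitted.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the inactivity monitor; not the ActiveMQ source.
class InactivityMonitorSketch {

    // Stand-in for the next transport listener in the chain (assumed name).
    interface FailureListener {
        void onException(IOException error);
    }

    private final ReentrantReadWriteLock sendLock = new ReentrantReadWriteLock();
    private final AtomicBoolean failed = new AtomicBoolean(false);
    private final FailureListener next;

    InactivityMonitorSketch(FailureListener next) {
        this.next = next;
    }

    // Senders share the read lock and fail fast once the transport is marked failed.
    void oneway(Object command) throws IOException {
        sendLock.readLock().lock();
        try {
            if (failed.get()) {
                throw new IOException("Transport has already failed");
            }
            // A real transport would write the command to the wire here.
        } finally {
            sendLock.readLock().unlock();
        }
    }

    // Write check: record the failure under the write lock, notify only after releasing it.
    void writeCheck(boolean simulateWriteFailure) {
        IOException failure = null;
        sendLock.writeLock().lock();
        try {
            if (simulateWriteFailure) {
                failed.set(true); // set before unlocking so waiting senders see it
                failure = new IOException("Keep-alive write failed");
            }
        } finally {
            sendLock.writeLock().unlock();
        }
        if (failure != null) {
            next.onException(failure); // called with no monitor lock held
        }
    }

    boolean writeLockHeldByCurrentThread() {
        return sendLock.isWriteLockedByCurrentThread();
    }

    public static void main(String[] args) {
        InactivityMonitorSketch monitor = new InactivityMonitorSketch(
            error -> System.out.println("listener notified: " + error.getMessage()));
        monitor.writeCheck(true);
        try {
            monitor.oneway("command");
        } catch (IOException e) {
            System.out.println("oneway failed: " + e.getMessage());
        }
    }
}
```

      Because the listener runs after the write lock is released, a failover-style handler can take its own mutex without waiting on the monitor, and any sender that was blocked on the read lock wakes up, sees the failed flag and throws.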

          People

            Assignee: Timothy A. Bish (tabish)
            Reporter: Timothy A. Bish (tabish)