  1. ActiveMQ Classic
  2. AMQ-5712

Broker can deadlock when using queues while producers wait on disk space


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.11.1
    • Fix Version/s: 5.13.0
    • Component/s: Broker
    • Labels: None

    Description

      I am experiencing a deadlock when using a Queue with non-persistent messages. The queue has a cursor memory high water mark set (currently 70%). When a producer sends messages to the queue quickly and that limit is hit, the broker can deadlock. I have tried setting producerWindowSize and alwaysSyncSend, neither of which seemed to help. Once the broker hits that limit, I am unable to do things like purge the queue. Consumers can deadlock as well.

      Note that this appears to be the same issue as described in AMQ-2475. The difference is that I am using a Queue rather than a Topic, and the fix for that issue appears to have been applied only to Topics.

      The problem appears to be in the Queue class on line 1852, inside the cursorAdd method. The call in question is return messages.addMessageLast(msg);, which blocks indefinitely if there is no space available. While it blocks, it holds messagesLock, so no other thread can acquire that lock. We have seen a deadlock where consumers can't consume because they are waiting on this lock. It looks like part of the fix in AMQ-2475 was to replace messages.addMessageLast(msg) with messages.tryAddMessageLast(msg, 10).

      I also noticed that not all of the message cursors support tryAddMessageLast, which could be a problem. FilePendingMessageCursor implements it, but the rest of the cursors (notably StoreQueueCursor) simply delegate back to addMessageLast in the parent class. So part of this fix may require implementing tryAddMessageLast across more cursors.
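To illustrate the locking pattern described above, here is a minimal, self-contained sketch. It is not the ActiveMQ code: QueueSketch, its Semaphore-based space accounting, and the consume method are hypothetical stand-ins, and only the names cursorAdd and messagesLock are borrowed from the description. The point it tries to show is that a bounded wait lets the adding thread give up and release messagesLock, where an unbounded wait would keep the lock held for as long as the cursor stays full.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a queue guarded by a single messagesLock.
class QueueSketch {
    // Assumption: one permit per free message slot models the memory limit.
    private final Semaphore space;
    // Plays the role of messagesLock from the description.
    private final ReentrantLock messagesLock = new ReentrantLock();
    private final Deque<String> messages = new ArrayDeque<>();

    QueueSketch(int capacity) {
        this.space = new Semaphore(capacity);
    }

    // Waits at most maxWaitMillis for space while holding messagesLock.
    // Returns false on timeout so the lock is released and the caller can retry.
    boolean cursorAdd(String msg, long maxWaitMillis) throws InterruptedException {
        messagesLock.lock();
        try {
            if (!space.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
            messages.addLast(msg);
            return true;
        } finally {
            messagesLock.unlock();
        }
    }

    // Consumers need the same lock, so they stall whenever an add holds it.
    String consume() {
        messagesLock.lock();
        try {
            String msg = messages.pollFirst();
            if (msg != null) {
                space.release(); // free the slot for a waiting producer
            }
            return msg;
        } finally {
            messagesLock.unlock();
        }
    }
}
```

With an unbounded space.acquire() in place of tryAcquire, a full queue would leave consume() waiting on messagesLock indefinitely, which is the shape of the deadlock reported here. A caller of cursorAdd would need to retry, or fail the send, when it returns false.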

      Here is part of the thread dump showing the stuck producer:

      "ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
         java.lang.Thread.State: TIMED_WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
              at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
              at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
              at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
              at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
              at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
              - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
              at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
              - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
              at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
              - locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
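The dump above shows a thread that is TIMED_WAITING while still holding two monitors. As a rough aid for spotting the same pattern in a running JVM, the sketch below uses the standard java.lang.management API to list threads that are waiting while holding at least one object monitor. The class name HeldWhileWaiting and its find method are made up for this example, and it assumes the JVM supports monitor usage reporting (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ActiveMQ.
class HeldWhileWaiting {
    // Returns the names of threads that are WAITING or TIMED_WAITING
    // while still holding at least one object monitor.
    static List<String> find() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        // true: include locked monitors; false: skip ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(true, false)) {
            Thread.State state = info.getThreadState();
            boolean waiting = state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING;
            if (waiting && info.getLockedMonitors().length > 0) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }
}
```

A thread reported by this helper is not necessarily deadlocked; it is only a candidate worth inspecting, since any other thread that needs one of its monitors will block until the wait ends.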
      

          People

            Assignee: Christopher L. Shannon (cshannon)
            Reporter: Christopher L. Shannon (cshannon)