Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 5.0.0
    • Component/s: None
    • Labels: None

      Description

      The issue is described in this thread:

      http://www.nabble.com/Blocking-on-UsageManager.waitForSpace-again-tf3031460.html

      Basically, a slow consumer can cause a producer to block indefinitely in UsageManager.waitForSpace.

      A basic code sample to reproduce the problem is provided here:

      http://www.nabble.com/Re%3A-Blocking-on-UsageManager.waitForSpace-again-p8521205.html

      Note that the reproducibility of the bug depends on the initial and maximum size of the VM's heap.
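
      For reference, a minimal sketch of the failing pattern, assuming the ActiveMQ client API and an in-VM broker (the class name, destination, payload size, and delay are illustrative, not the attached test case): a producer publishes small messages as fast as it can while a single slow listener drains the same topic.

      import javax.jms.BytesMessage;
      import javax.jms.Connection;
      import javax.jms.ConnectionFactory;
      import javax.jms.MessageConsumer;
      import javax.jms.MessageProducer;
      import javax.jms.Session;
      import javax.jms.Topic;
      import org.apache.activemq.ActiveMQConnectionFactory;

      public class SlowConsumerRepro {
          public static void main(String[] args) throws Exception {
              ConnectionFactory factory =
                  new ActiveMQConnectionFactory("vm://localhost?broker.persistent=false");

              // Consumer: deliberately slow to drain the topic.
              Connection consumerConn = factory.createConnection();
              consumerConn.start();
              Session consumerSession = consumerConn.createSession(false, Session.AUTO_ACKNOWLEDGE);
              Topic topic = consumerSession.createTopic("TEST.SLOW");
              MessageConsumer consumer = consumerSession.createConsumer(topic);
              consumer.setMessageListener(message -> {
                  try {
                      Thread.sleep(1000); // simulate a slow consumer
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                  }
              });

              // Producer: sends ~2 kB messages flat out. On affected versions
              // this thread eventually parks in UsageManager.waitForSpace once
              // the broker's memory limit is reached, and never wakes up.
              Connection producerConn = factory.createConnection();
              producerConn.start();
              Session producerSession = producerConn.createSession(false, Session.AUTO_ACKNOWLEDGE);
              MessageProducer producer = producerSession.createProducer(topic);
              byte[] payload = new byte[2048];
              for (int i = 0; ; i++) {
                  BytesMessage msg = producerSession.createBytesMessage();
                  msg.writeBytes(payload);
                  producer.send(msg);
                  System.out.println("S: " + i);
              }
          }
      }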

      Attachments

      1. ActiveMqBrokerThreadDump.txt
        37 kB
        Joe White
      2. SlowReceive.zip
        2 kB
        Tom Vijlbrief
      3. SlowReceive2.zip
        2 kB
        Tom Vijlbrief
      4. SlowReceive3.zip
        2 kB
        Tom Vijlbrief


          Activity

          Joe White added a comment -

          I am having exactly the issue described by Albert with UsageManager.waitForSpace. The differences in my system are that I have multiple producers and multiple consumers. It is running under Java 1.5_09 on Red Hat.

          The attached file contains a thread dump of the ActiveMQ broker, obtained using 'kill -QUIT [PID]' after the lockup has happened. Each of the transport threads is blocking on UsageManager.waitForSpace.

          Joe White added a comment -

          More on the waitForSpace broker lockup.

          JConsole indicates that the broker JVM has plenty of heap space remaining when waitForSpace is called. In my case the lockup occurs after a certain total message volume has been reached. The total volume is approximately:

          Received - 8600
          Sent - 17000

          The lockup occurs reliably within +/- 500 messages of that volume. The messages are approximately 2 kB each. The broker has a maximum heap size of 512 MB.

          I have attempted to fix this issue by modifying the pendingMessageLimitStrategy, but it had no effect on the lockup. Specifically, I added
          <constantPendingMessageLimitStrategy limit="10"/>

          to the policy configuration for all of my topics. I have also tried setting persistence to both true and false.
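
          For context, a sketch of how such an entry is typically wired into the broker's destinationPolicy in activemq.xml (the wildcard and limit are illustrative; note that this strategy only affects slow non-durable topic subscribers):

          <destinationPolicy>
            <policyMap>
              <policyEntries>
                <!-- ">" is the ActiveMQ wildcard matching all topics -->
                <policyEntry topic=">">
                  <pendingMessageLimitStrategy>
                    <constantPendingMessageLimitStrategy limit="10"/>
                  </pendingMessageLimitStrategy>
                </policyEntry>
              </policyEntries>
            </policyMap>
          </destinationPolicy>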

          I am using ActiveMQ as the JMS broker for a Mule instance, and as far as I know all subscribers are non-durable, though I haven't figured out how to confirm this. DEBUG is enabled on the broker, and the UsageManager logs things like:

          DEBUG UsageManager Memory usage change. from: 1, to: 2
          DEBUG UsageManager Memory usage change. from: 1, to: 3
          ...
          DEBUG UsageManager Memory usage change. from: 99, to: 100

          At which point all messaging stops. Is there a configuration change that can be made to eliminate this issue?

          Tom Vijlbrief added a comment -

          The hang of a durable subscriber described in AMQ-1150 is probably caused by the same problem.

          Tom Vijlbrief added a comment -

          JUnit test which deadlocks after about 50 sent messages:

          S: 47
          S: 48
          R: 12
          S: 49
          S: 50
          S: 51
          S: 52 (test hangs)

          Tom Vijlbrief added a comment -

          I figured out that the SlowReceive1 JUnit test probably deadlocks
          because the connection is shared between the sender and the consumer.

          So I rewrote the test, and now it runs to completion.

          Halfway through the send, the producer slows down and falls
          into sync with each received message.

          When I run my test programs against the TCP broker I still have the
          problem. The sender runs to completion, but the receiver blocks...
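
          In other words, the rewrite amounts to giving each side its own connection, roughly as follows (a sketch, not the attached test; factory and topic are assumed to be set up as in the reproduction sketch under the description):

          // Shared variant: producer and consumer sessions hang off ONE
          // connection, which is what SlowReceive1 did and what deadlocks.
          // Separate variant: a producer parked in UsageManager.waitForSpace
          // can no longer stall the consumer's transport, so the consumer
          // keeps draining and eventually frees space.
          Connection producerConn = factory.createConnection();
          Connection consumerConn = factory.createConnection();
          producerConn.start();
          consumerConn.start();
          Session producerSession = producerConn.createSession(false, Session.AUTO_ACKNOWLEDGE);
          Session consumerSession = consumerConn.createSession(false, Session.AUTO_ACKNOWLEDGE);
          MessageProducer producer = producerSession.createProducer(topic);
          MessageConsumer consumer = consumerSession.createConsumer(topic);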

          Tom Vijlbrief added a comment -

          This JUnit test reproduces the hang reliably.

          It has inner loops which create new connections after every 25 messages.
          On the second iteration, both the producer and the consumer block.
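
          Roughly, the structure that triggers it (a sketch reconstructed from this description; the counts come from the comment, everything else is assumed):

          // Two outer iterations; the hang shows up on the second pass.
          for (int iteration = 0; iteration < 2; iteration++) {
              Connection conn = factory.createConnection();
              conn.start();
              Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
              MessageProducer producer = session.createProducer(topic);

              // Inner loop: send 25 messages, then tear the connection down
              // and start over with a fresh one.
              for (int i = 0; i < 25; i++) {
                  producer.send(session.createTextMessage("msg " + i));
              }
              conn.close();
          }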

          james strachan added a comment -

          Can you reproduce this issue against the current trunk (the latest 5.0.0 snapshot)? We've put in a fair bit of work to try to avoid these kinds of issues.

          Albert Strasheim added a comment -

          The test inside SlowReceive3.zip passes now. I'd like to know what changed, though. How did you manage to fix this issue without running this test?

          Anyway, I'll test AMQ-1148 next, which is probably a related problem.

          james strachan added a comment -

          Great stuff; I'm going to mark this issue resolved against 5.0.0 unless anyone can find a reason to reopen it.

          Basically, one of the big changes in 5.x is the way the broker deals with memory limits. Firstly, producers and consumers have separate usage-manager limits (on a per-destination basis if required), which eases some of the old lockups we used to get. In addition, message cursors give the option of spooling to disk rather than blocking indefinitely when a slow consumer is around:
          http://activemq.apache.org/message-cursors.html

          Finally, we have a new producer flow control option:
          http://activemq.apache.org/producer-flow-control.html
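
          For example, a sketch of the relevant 5.x knobs in activemq.xml (all limits illustrative): separate system-wide memory, store, and temp limits, plus a per-destination producerFlowControl switch. With flow control off, messages spool to disk via cursors instead of blocking the producer.

          <broker xmlns="http://activemq.apache.org/schema/core">
            <destinationPolicy>
              <policyMap>
                <policyEntries>
                  <policyEntry topic=">" producerFlowControl="true" memoryLimit="5mb"/>
                </policyEntries>
              </policyMap>
            </destinationPolicy>

            <systemUsage>
              <systemUsage>
                <memoryUsage>
                  <memoryUsage limit="64 mb"/>
                </memoryUsage>
                <storeUsage>
                  <storeUsage limit="1 gb"/>
                </storeUsage>
                <tempUsage>
                  <tempUsage limit="256 mb"/>
                </tempUsage>
              </systemUsage>
            </systemUsage>
          </broker>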


            People

            • Assignee:
              Unassigned
            • Reporter:
              Albert Strasheim
            • Votes:
              4
            • Watchers:
              4
