Affects Version/s: 5.13.0
Unmodified release version of ActiveMQ 5.13.0 (http://www.eu.apache.org/dist/activemq/5.13.0/apache-activemq-5.13.0-bin.tar.gz)
On our production ActiveMQ broker (which processes around 10,000 messages / sec on average), we have encountered situations where queues started blocking completely after running without problems for a couple of days.
When looking at the ActiveMQ logs, we see messages like this (I've changed queue names and client IPs):
2015-12-17 20:52:37,375 | INFO | Usage(default:memory:queue://Consumer.AAA.VirtualTopic.OFFER:memory) percentUsage=100%, usage=104858538, limit=104857600, percentUsageMinDelta=1%;Parent:Usage(default:memory) percentUsage=42%, usage=305669289, limit=720424141, percentUsageMinDelta=1%: Usage Manager Memory Limit reached. Producer (ID:ip-172-30-0-97-38230-1450370654525-1:8:1:1) stopped to prevent flooding queue://Consumer.AAA.VirtualTopic.OFFER. See http://activemq.apache.org/producer-flow-control.html for more info (blocking for: 151s) | org.apache.activemq.broker.region.Queue | ActiveMQ Transport: tcp:///22.214.171.124:36128@61616
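As a quick sanity check of the numbers in the log line above, here is a minimal sketch that extracts the `usage=` / `limit=` pairs and recomputes the percentages. The class and method names are made up for illustration, and the integer-division rounding is an assumption that happens to reproduce the logged values; it is not taken from the broker source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ActiveMQ: re-derives the percentages
// printed in the "Usage Manager Memory Limit reached" log line.
public class UsageLogCheck {
    private static final Pattern USAGE =
            Pattern.compile("usage=(\\d+), limit=(\\d+)");

    /** Returns {usage, limit} pairs in the order they appear in the line. */
    static long[][] parseUsages(String logLine) {
        List<long[]> pairs = new ArrayList<>();
        Matcher m = USAGE.matcher(logLine);
        while (m.find()) {
            pairs.add(new long[] {
                Long.parseLong(m.group(1)), Long.parseLong(m.group(2))});
        }
        return pairs.toArray(new long[0][]);
    }

    /** Percentage with integer truncation (assumed rounding). */
    static int percent(long usage, long limit) {
        return (int) (usage * 100 / limit);
    }

    public static void main(String[] args) {
        // Excerpt of the log line quoted above.
        String line = "percentUsage=100%, usage=104858538, limit=104857600, "
                + "percentUsageMinDelta=1%;Parent:Usage(default:memory) "
                + "percentUsage=42%, usage=305669289, limit=720424141";
        long[][] u = parseUsages(line);
        long[] queue = u[0];   // per-queue memory usage
        long[] parent = u[1];  // broker-wide memory usage

        System.out.println("queue:  " + percent(queue[0], queue[1]) + "%");
        System.out.println("parent: " + percent(parent[0], parent[1]) + "%");
        System.out.println("queue limit in MiB: " + queue[1] / (1024 * 1024));
        System.out.println("bytes over limit:   " + (queue[0] - queue[1]));
    }
}
```

In other words, the queue's own limit of 100 MiB is exceeded by 938 bytes while the broker-wide memory usage is only at 42%, so it is the per-queue limit that blocks the producer.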
The strange thing is that the admin interface shows no messages queued in the above-mentioned queue, and purging the queue does not help.
The only things we have found so far that get the broker to process messages again in this situation are:
- delete the queue (it then is recreated automatically and works again)
- restart the broker
I have now tried to reproduce the situation locally and came up with a test case that, while I am not sure it is the exact problem we face in production, at least produces the same symptoms as described above. I have noticed that we sometimes have network issues between the clients and the broker, and have therefore simulated something similar in the test code.
The test code launches 4 producing threads and 4 consuming threads. The producers send > 1000 messages / sec to the queues and the consumers just read them. Every 10 seconds, one of the consuming threads is interrupted and then, after a delay of another 10 seconds, its connection is cleaned up (to free the messages that are already allocated in the dead connection's prefetch buffer).
When running the test case on a fresh download of ActiveMQ 5.13.0, it takes a long time until the broker completely blocks, as it takes time for the memory to fill up. However, when checking JMX stats, it is clearly visible that the following metrics behave strangely:
Both of the above metrics stay fairly constant for some time and then, once a thread gets interrupted and its connection is cleaned up, they suddenly increase by a couple of megabytes. While the consumers and producers work normally, the size again stays fairly constant, until another consumer is interrupted, which again increases memory by a couple of megabytes. This continues until memory is completely full and no messages can pass through the broker anymore.
To speed things up, a lower memory limit can be set on the queue in the activemq.xml configuration file, which shortens the time until the broker blocks messages on the queue.
Even terminating the client JVM does not free up resources on the broker.