ActiveMQ
  1. ActiveMQ
  2. AMQ-1557

High message frequency causes ActiveMQ to freeze

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.0, 5.1.0
    • Fix Version/s: 5.2.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      observed on Windows XP Professional SP2, Sun JDK 1.5.01_11, but probably affects all platforms

      Description

      ActiveMQ freezes in a test setup with very high message frequency of persistent messages and multiple producers and consumers running on multiple machines:

      • Host 1: Multiple Producers send messages very fast to queue1
      • Host 2: Multiple Processors consume message from queue1 and send a new message for each consumed message to queue2, also very fast.
      • Host 3: ActiveMQ server and 1 consumer reads messages from queue2.

      With producerFlowControl enabled, the system freezes very soon (a bit more than 1000 messages).
      With producerFlowControl disabled, it seems to run longer at first, but it still freezes later.
      Using transactions or not does not seem to make a difference.

      See this thread in user mailing list for more details:
      http://www.nabble.com/High-message-frequency-causes-ActiveMQ-to-freeze-to14919292s2354.html

      See attachment for code of test cases and the batch files to start them.

      1. stresstest.zip
        13 kB
        Jürgen Schumacher

        Activity

        Hide
        Marco Buss added a comment -

        I can see the same with topics async send and producer flow control. The persistence store was derby and mysql.
        I used 3 Topics with 9 diferent subscribers for each topic. 5 of this subscribers were slow (they pause for 60 seconds after every message). All looks fine as long as the topic reaches 1000 Messages. Then the cpu amount used increse to 90% and no message can be delivered, but the Consumers already get the Messages.

        Show
        Marco Buss added a comment - I can see the same with topics async send and producer flow control. The persistence store was derby and mysql. I used 3 Topics with 9 diferent subscribers for each topic. 5 of this subscribers were slow (they pause for 60 seconds after every message). All looks fine as long as the topic reaches 1000 Messages. Then the cpu amount used increse to 90% and no message can be delivered, but the Consumers already get the Messages.
        Hide
        Grant Smith added a comment -

        This still occurs with today's SNAPSHOT. Any updates on getting this fixed ?

        Show
        Grant Smith added a comment - This still occurs with today's SNAPSHOT. Any updates on getting this fixed ?
        Hide
        Grant Smith added a comment -

        For me I'm getting :

        ERROR RecoveryListenerAdapter - Message id ID:gsdev2.marathon-man.com-58448-1207350845238-0:0:1:1:2122 could not be recovered from the data store - already dispatched
        ERROR RecoveryListenerAdapter - Message id ID:gsdev2.marathon-man.com-58448-1207350845238-0:0:1:1:2123 could not be recovered from the data store - already dispatched

        multiple times under high bandwidth conditions. It appears to be a manifestation of

        https://issues.apache.org/activemq/browse/AMQ-1544 which appears to NOT be resolved ?

        Show
        Grant Smith added a comment - For me I'm getting : ERROR RecoveryListenerAdapter - Message id ID:gsdev2.marathon-man.com-58448-1207350845238-0:0:1:1:2122 could not be recovered from the data store - already dispatched ERROR RecoveryListenerAdapter - Message id ID:gsdev2.marathon-man.com-58448-1207350845238-0:0:1:1:2123 could not be recovered from the data store - already dispatched multiple times under high bandwidth conditions. It appears to be a manifestation of https://issues.apache.org/activemq/browse/AMQ-1544 which appears to NOT be resolved ?
        Hide
        Rob Davies added a comment -

        Before we even schedule this issue - the attachment needs to be Granted to the ASF (tick the check box) and all files should be Apache licenced - could you delete and reapply?

        Show
        Rob Davies added a comment - Before we even schedule this issue - the attachment needs to be Granted to the ASF (tick the check box) and all files should be Apache licenced - could you delete and reapply?
        Hide
        Jürgen Schumacher added a comment -

        re-attached with "Grant license..."

        Show
        Jürgen Schumacher added a comment - re-attached with "Grant license..."
        Hide
        Filip Hanik added a comment -

        I'm able to consistently reproduce the error with 5.0.0 and 5.1.0
        I haven't been able to try it out trunk, since it doesn't build when I check it out.

        java.lang.OutOfMemoryError: Java heap space
        Dumping heap to java_pid3448.hprof ...
        Heap dump file created [212044211 bytes in 14.203 secs]
        Exception in thread "ActiveMQ Transport: tcp:///127.0.0.1:1206" java.lang.OutOfMemoryError: Java heap space
        at org.apache.activemq.openwire.v3.BaseDataStreamMarshaller.tightUnmarshalByteSequence(BaseDataStreamMarshaller.
        java:441)
        at org.apache.activemq.openwire.v3.MessageMarshaller.tightUnmarshal(MessageMarshaller.java:68)
        at org.apache.activemq.openwire.v3.ActiveMQMessageMarshaller.tightUnmarshal(ActiveMQMessageMarshaller.java:67)
        at org.apache.activemq.openwire.v3.ActiveMQTextMessageMarshaller.tightUnmarshal(ActiveMQTextMessageMarshaller.ja
        va:67)
        at org.apache.activemq.openwire.OpenWireFormat.doUnmarshal(OpenWireFormat.java:361)
        at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:275)
        at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:192)
        at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:184)
        at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:172)
        at java.lang.Thread.run(Thread.java:595)

        Here is the info

        1. Configuration file
        http://people.apache.org/~fhanik/activemq.xml

        2. Modified example code
        http://people.apache.org/~fhanik/example.zip

        Steps to reproduce
        1. Apply the activemq.xml file
        2. start the activemq server
        3. start the consumer 'ant consumer' in the example directory (from example.zip)
        4. start the producer 'ant producer' in the example directory (from example.zip)

        Just sit and watch, eventually it starts slowing down, and finally it throws an OutOfMemoryError

        As long as the consumer is connected, it's not skipping or discarding any messages.
        I was expecting the consumer to receive only the latst, but as long as the consumer is connected, not a single message gets discarded, or at least it appears that way, until the server runs out of java heap.

        Filip

        Show
        Filip Hanik added a comment - I'm able to consistently reproduce the error with 5.0.0 and 5.1.0 I haven't been able to try it out trunk, since it doesn't build when I check it out. java.lang.OutOfMemoryError: Java heap space Dumping heap to java_pid3448.hprof ... Heap dump file created [212044211 bytes in 14.203 secs] Exception in thread "ActiveMQ Transport: tcp:///127.0.0.1:1206" java.lang.OutOfMemoryError: Java heap space at org.apache.activemq.openwire.v3.BaseDataStreamMarshaller.tightUnmarshalByteSequence(BaseDataStreamMarshaller. java:441) at org.apache.activemq.openwire.v3.MessageMarshaller.tightUnmarshal(MessageMarshaller.java:68) at org.apache.activemq.openwire.v3.ActiveMQMessageMarshaller.tightUnmarshal(ActiveMQMessageMarshaller.java:67) at org.apache.activemq.openwire.v3.ActiveMQTextMessageMarshaller.tightUnmarshal(ActiveMQTextMessageMarshaller.ja va:67) at org.apache.activemq.openwire.OpenWireFormat.doUnmarshal(OpenWireFormat.java:361) at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:275) at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:192) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:184) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:172) at java.lang.Thread.run(Thread.java:595) Here is the info 1. Configuration file http://people.apache.org/~fhanik/activemq.xml 2. Modified example code http://people.apache.org/~fhanik/example.zip Steps to reproduce 1. Apply the activemq.xml file 2. start the activemq server 3. start the consumer 'ant consumer' in the example directory (from example.zip) 4. start the producer 'ant producer' in the example directory (from example.zip) Just sit and watch, eventually it starts slowing down, and finally it throws an OutOfMemoryError As long as the consumer is connected, it's not skipping or discarding any messages. I was expecting the consumer to receive only the latst, but as long as the consumer is connected, not a single message gets discarded, or at least it appears that way, until the server runs out of java heap. Filip
        Hide
        Filip Hanik added a comment -

        If I reconfigure the dead letter strategy
        <deadLetterStrategy>
        <sharedDeadLetterStrategy processExpired="false" processNonPersistent="false" />
        </deadLetterStrategy>

        and turn off the flooding of the logs that it causes

        log4j.logger.org.apache.activemq.broker.region.RegionBroker=ERROR

        The system seems to stay up, in this case, my client (consumer) OOMs, and during the time it OOMS, the producer freezes up again,
        it seems that the producer is synchronized on a connected consumer, and there is no asynchronicity between the two

        Filip

        Show
        Filip Hanik added a comment - If I reconfigure the dead letter strategy <deadLetterStrategy> <sharedDeadLetterStrategy processExpired="false" processNonPersistent="false" /> </deadLetterStrategy> and turn off the flooding of the logs that it causes log4j.logger.org.apache.activemq.broker.region.RegionBroker=ERROR The system seems to stay up, in this case, my client (consumer) OOMs, and during the time it OOMS, the producer freezes up again, it seems that the producer is synchronized on a connected consumer, and there is no asynchronicity between the two Filip
        Hide
        Rob Davies added a comment -

        Resolved in 5.2

        Show
        Rob Davies added a comment - Resolved in 5.2

          People

          • Assignee:
            Rob Davies
            Reporter:
            Jürgen Schumacher
          • Votes:
            8 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development