Uploaded image for project: 'ActiveMQ'
  1. ActiveMQ
  2. AMQ-6062

Broker goes 100% CPU on multi-queue command line browse

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.9.1, 5.12.0
    • Fix Version/s: 5.12.2, 5.13.1, 5.14.0
    • Component/s: Broker
    • Labels:
      None
    • Environment:

      Linux Ubuntu, Aix, with both Java 6 and Java 7 (either IBM or Oracle)

      Description

      Hello,
      The Broker may enter a state of high CPU consumption (100% or 200% if 2 browse enter the infinite loop in the same time on a multi-core) when a command line browse on multiple queues is issued. When that arrives the only way we found to restore the situation was to kill the broker and restart because he was not responding anymore to clients. Killing the clients and the browse process does not lower the CPU usage of the broker.
      I assume the infinite loop based on successive thread dumps that show the browse thread remaines in the same code zone:

      Here is an example of the Browse thread call stack trace, when Broker is in high CPU usage state:

      at java/util/HashMap.rehash(HashMap.java:782(Compiled Code))
      at java/util/HashMap.rehash(HashMap.java:819(Compiled Code))
      at java/util/HashMap.putImpl(HashMap.java:702(Compiled Code))
      at java/util/HashMap.put(HashMap.java:680(Compiled Code))
      at org/apache/activemq/broker/region/QueueBrowserSubscription.isDuplicate(QueueBrowserSubscription.java:72(Compiled Code))
      at org/apache/activemq/broker/region/Queue.iterate(Queue.java:1688(Compiled Code))
      at org/apache/activemq/thread/PooledTaskRunner.runTask(PooledTaskRunner.java:133(Compiled Code)).

      Tested ActiveMQ versions: 5.9.1 and the last, 5.12.0, but of course I suppose all in beetwen are affected.

      The cause seems to be the non-threadsafe usage of the audit HashMap member of QueueBrowserSubscription.java class.
      Here are some facts on non thread-safe HashMap use problems:
      High CPU / Hang on java.util.HashMap.findNonNullKeyEntry() due to non-thread-safe usage of HashMap
      http://www-01.ibm.com/support/docview.wss?uid=swg21597581
      IZ73767: INFINITE LOOP CAN OCCUR IN HASHMAP
      http://www-01.ibm.com/support/docview.wss?uid=swg1IZ73767

      I must underline that this is not an IBM JDK specific problem as the IBM JDK is based on Oracle JDK and changing the JDK or JRE is not a solution.

      The scenario may be produced with a fresh non modified install (default config) of ActiveMQ broker:
      1. set ACTIVEMQ_HOME, JAVA_HOME and PATH to point accordingly to the right activemq, and Java version.
      2. cd $ACTIVEMQ_HOME/bin; ./activemq start
      3. inject at least 2000 messages in each of two ques, say QA.ONE and QA.TWO
      4. $ $ACTIVEMQ_HOME/bin/activemq browse --view JMSDestination,JMSMessageID --amqurl tcp://localhost:61616 QA.*

      The simplest fix that I suggest is to change in the QueueBrowserSubscription.java class the type of audit member to ConcurrentHashMap.

        Attachments

        1. ActiveMQ_BrowseInfiniteLoopUseCase.jmx
          14 kB
          Dumitru HUSLEAG
        2. activemq-test.log
          10 kB
          Dumitru HUSLEAG
        3. AMQ6062Test.java
          9 kB
          Dumitru HUSLEAG
        4. HotSpotThreadDump.txt
          47 kB
          Dumitru HUSLEAG
        5. IBMjava7AndAMQMaster.javacore.20151202.161643.18441.0001.txt
          432 kB
          Dumitru HUSLEAG
        6. ThreadStatusAnalysis_2015-11-25_14h13m33s.png
          264 kB
          Dumitru HUSLEAG

          Activity

            People

            • Assignee:
              cshannon Christopher L. Shannon
              Reporter:
              doomytroo Dumitru HUSLEAG
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 24h
                24h
                Remaining:
                Remaining Estimate - 24h
                24h
                Logged:
                Time Spent - Not Specified
                Not Specified