Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Incomplete
- Affects Version: 5.12.1
- Fix Version: None
- Environment: RHEL 6.6, java-openjdk-1.7.0 u95
Description
To improve CPU usage, we run a test setup with a network of 3+ brokers using the following broker configuration
<broker useJmx="${activemq.expose.jmx}" persistent="false"
        brokerName="${activemq.brokerName}"
        xmlns="http://activemq.apache.org/schema/core">
  <sslContext>
    <amq:sslContext keyStore="${activemq.broker.keyStore}"
                    keyStorePassword="${activemq.broker.keyStorePassword}"
                    trustStore="${activemq.broker.trustStore}"
                    trustStorePassword="${activemq.broker.trustStorePassword}" />
  </sslContext>
  <systemUsage>
    <systemUsage>
      <memoryUsage>
        <memoryUsage limit="${activemq.memoryUsage}" />
      </memoryUsage>
      <tempUsage>
        <tempUsage limit="${activemq.tempUsage}" />
      </tempUsage>
    </systemUsage>
  </systemUsage>
  <destinationPolicy>
    <policyMap>
      <policyEntries>
        <policyEntry queue=">" enableAudit="false">
          <networkBridgeFilterFactory>
            <conditionalNetworkBridgeFilterFactory replayWhenNoConsumers="true" />
          </networkBridgeFilterFactory>
        </policyEntry>
      </policyEntries>
    </policyMap>
  </destinationPolicy>
  <networkConnectors>
    <networkConnector name="queues" uri="static:(${activemq.otherBrokers})"
                      networkTTL="2" dynamicOnly="true"
                      decreaseNetworkConsumerPriority="true"
                      conduitSubscriptions="false">
      <excludedDestinations>
        <topic physicalName=">" />
      </excludedDestinations>
    </networkConnector>
    <networkConnector name="topics" uri="static:(${activemq.otherBrokers})"
                      networkTTL="1" dynamicOnly="true"
                      decreaseNetworkConsumerPriority="true"
                      conduitSubscriptions="true">
      <excludedDestinations>
        <queue physicalName=">" />
      </excludedDestinations>
    </networkConnector>
  </networkConnectors>
  <transportConnectors>
    <transportConnector uri="${activemq.protocol}${activemq.host}:${activemq.tcp.port}?needClientAuth=true"
                        updateClusterClients="true" rebalanceClusterClients="true" />
    <transportConnector uri="${activemq.websocket.protocol}${activemq.websocket.host}:${activemq.websocket.port}?needClientAuth=true"
                        updateClusterClients="true" rebalanceClusterClients="true" />
  </transportConnectors>
</broker>
with the following placeholder values:
activemq.tcp.port=9000
activemq.protocol=ssl://
activemq.brokerName=activemq-server1.com
activemq.expose.jmx=true
activemq.otherBrokers=ssl://server2.com:9000,ssl://server3.com:9000
activemq.websocket.port=9001
activemq.websocket.protocol=stomp+ssl://
activemq.websocket.host=server1.com
activemq.memoryUsage=1gb
activemq.tempUsage=1gb
We changed the activemq.protocol placeholder from the original ssl:// to nio+ssl:// and immediately observed the hoped-for CPU improvement (the same holds when comparing tcp:// with nio://). However, after a new deployment of our ActiveMQ and its subsequent restart, we started to encounter strange behavior: some producers would either get timeouts on their request-reply messages or an "unknown destination" exception once the reply was sent on a temp queue. The issue only occurred when the producer and consumer were connected to different brokers in the network.

After some testing we ultimately found that, after a restart, brokers would often not start both network bridges (one for queues, one for topics) but only one of them. For example, in a 3-broker setup each broker usually has 4 network bridges active, 2 to each of the other brokers. After some restarts, however, we would see anywhere between 2 and 4 active bridges, and no matter how long we waited, the missing bridge to another broker was never started. The logs showed nothing unusual: as long as one broker was shut down, the other two would log 'connection refused'; once it came back up, they would log either one or two 'successfully reconnected' messages and start exactly that many bridges to it.
As soon as we switched the transport connector back to the ssl:// protocol, the issue was gone for good: no matter how many restarts, all 4 network bridges were started on each broker. Switching back to nio:// brings the problem back right away.
For now we are evaluating whether it is worth configuring an additional transportConnector running nio:// just for producers and consumers, while the network bridges use the tcp:// connector. The documentation on networkConnectors usually uses tcp:// or multicast:// (which is not an option for us) for the transportConnector the bridges attach to, so we are not entirely sure whether nio:// is even supposed to work for this case or whether this is indeed a bug somewhere.
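A minimal sketch of that workaround, assuming the bridges can be pointed at a dedicated connector by adjusting the port in activemq.otherBrokers on the other brokers; the connector names and the port 9002 are hypothetical, not from our deployment:

```
<transportConnectors>
  <!-- Producers/consumers connect here; NIO reduces per-connection threads. -->
  <transportConnector name="clients"
      uri="nio+ssl://${activemq.host}:${activemq.tcp.port}?needClientAuth=true"
      updateClusterClients="true" rebalanceClusterClients="true" />
  <!-- Network bridges from the other brokers attach here; plain ssl:// did not
       exhibit the missing-bridge behavior in our tests. Port 9002 is made up. -->
  <transportConnector name="network"
      uri="ssl://${activemq.host}:9002?needClientAuth=true" />
</transportConnectors>
```

The other brokers' activemq.otherBrokers would then reference the bridge port, e.g. static:(ssl://server2.com:9002,ssl://server3.com:9002), while client connection URIs keep using the existing port.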