ActiveMQ
  1. ActiveMQ
  2. AMQ-3454

Contention on a mutex during a stress when using SimpleAuthenticationPlugin

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 5.5.0
    • Fix Version/s: 5.6.0
    • Component/s: Broker
    • Labels:
      None

      Description

      I am testing ActiveMQ 5.5 with my own stress test. I tried to implement a stress test that use jms ressource in the same fashion that my application would do.

      For that, I used jms template (from Spring) and a pooled connection factory (as recommended). I run a fixed number of thread that plublish on fixed number of topics. Each thread pick up a topic and enter a loop that will send a message on the choosen topic.

      The issue is reproductible with SimpleAuthenticationPlugin active.

      Configuration of the test :

      • 100 topics
      • more than one thread per topic (actually 1500 producer threads)

      Common connection properties :

      • alwaysSessionAsync=false
      • dispatchAsync=false
      • optimizeAcknowledge=true
      • socketBufferSize=131072&trace=true&wireFormat.cacheSize=2048&wireFormat.tcpNoDelayEnabled=true&wireFormat.tightEncodingEnabled=true&keepAlive=true&soTimeout=10000&connectionTimeout=10000\

      Producers :

      • Message are not persistent
      • There is no transaction
      • Message expiration time is 30s
      • Unsing JMS template singleton with a PooledConnectionFactory
      • 3 JVM running
      • connection options alwaysSyncSend=true,copyMessageOnSend=false

      Consumers :

      • just listening to each topic with a SimpleMessageListenerContainer and count messages received
      • 3 JVM running

      In attachment, you will find activemq config file and thread dumps. I can attach the producer/consumers code if you need it but I think you have your own tests.

      This Jira is critical for me because security is mandatory for our use.

      If you need more information, please ask.

      1. amq-config.zip
        2 kB
        William
      2. amq-dump
        235 kB
        William
      3. RWLockDestinations.txt
        16 kB
        Timothy Bish

        Activity

        Hide
        William added a comment -

        Broker configuration used for the test

        Show
        William added a comment - Broker configuration used for the test
        Hide
        Timothy Bish added a comment -

        I was actually looking at that code yesterday and noticed that the locking on the destinationsMutex could result in unnecessary waits. I am attaching a fix that I was testing, if you'd like to try it out just apply it to a trunk checkout and build your own broker.

        Show
        Timothy Bish added a comment - I was actually looking at that code yesterday and noticed that the locking on the destinationsMutex could result in unnecessary waits. I am attaching a fix that I was testing, if you'd like to try it out just apply it to a trunk checkout and build your own broker.
        Hide
        William added a comment -

        I want to test with activemq 5.5 base code so I should checkout the 5.5 tag right (activemq/tags/activemq-5.5.0) ?

        Is it easy for you build a activemq-core-5.5.0.jar with your patch so that I will only have to replace the jar in the broker installation ?

        Show
        William added a comment - I want to test with activemq 5.5 base code so I should checkout the 5.5 tag right (activemq/tags/activemq-5.5.0) ? Is it easy for you build a activemq-core-5.5.0.jar with your patch so that I will only have to replace the jar in the broker installation ?
        Hide
        Timothy Bish added a comment -

        Easiest to apply to trunk and can't make any promises it will apply cleanly to 5.5 code. The trunk code is quite a bit improved over 5.5 release so it'd be good to test there. You will need to checkout the code and do a build via maven, which will create the broker distribution archives in the assembly/target folder. You can extract those files and get access to the built Jars.

        Show
        Timothy Bish added a comment - Easiest to apply to trunk and can't make any promises it will apply cleanly to 5.5 code. The trunk code is quite a bit improved over 5.5 release so it'd be good to test there. You will need to checkout the code and do a build via maven, which will create the broker distribution archives in the assembly/target folder. You can extract those files and get access to the built Jars.
        Hide
        William added a comment -

        ok I will try to patch 5.5.

        From what I understood from the patch, you change the synchronize by a ReentrantReadWriteLock. If the patch doesn't work I will try to modify it by myself.

        Is it possible to ask for a patch on 5.5 if it works ?

        Show
        William added a comment - ok I will try to patch 5.5. From what I understood from the patch, you change the synchronize by a ReentrantReadWriteLock. If the patch doesn't work I will try to modify it by myself. Is it possible to ask for a patch on 5.5 if it works ?
        Hide
        Timothy Bish added a comment -

        Fix applied to trunk, should show up in a SNAPSHOT build soon.

        Show
        Timothy Bish added a comment - Fix applied to trunk, should show up in a SNAPSHOT build soon.
        Hide
        William added a comment -

        I have made some tests on 5.5.0 (patch it myself) and it works better.

        But there is still a copy of the map when getDesinationMap() is called which can be heavy if you you have hundreds (or thousands) of topics and I think unnecessary since the Map is a ConcurrentHashMap that can handle concurrency if it is used properly.

        Do you plan to release 5.6.0 soon ? I'll try to make the patch on 5.5 when I have some time.

        Show
        William added a comment - I have made some tests on 5.5.0 (patch it myself) and it works better. But there is still a copy of the map when getDesinationMap() is called which can be heavy if you you have hundreds (or thousands) of topics and I think unnecessary since the Map is a ConcurrentHashMap that can handle concurrency if it is used properly. Do you plan to release 5.6.0 soon ? I'll try to make the patch on 5.5 when I have some time.
        Hide
        Gary Tully added a comment -

        Think the hashmap copy is too much.

        Show
        Gary Tully added a comment - Think the hashmap copy is too much.
        Hide
        Dejan Bosanac added a comment -

        Just a note that after a brief look, it seems that only RegionBroker.getDestinationMap() method should be refactored, so we can remove a copy

        Show
        Dejan Bosanac added a comment - Just a note that after a brief look, it seems that only RegionBroker.getDestinationMap() method should be refactored, so we can remove a copy
        Hide
        Gary Tully added a comment -

        removed the copy from AbstractRegion (Region) .getDestinationMap

        30% improvement on a simple non persistent send test with 10000 destinations.

        http://svn.apache.org/viewvc?rev=1207224&view=rev

        Show
        Gary Tully added a comment - removed the copy from AbstractRegion (Region) .getDestinationMap 30% improvement on a simple non persistent send test with 10000 destinations. http://svn.apache.org/viewvc?rev=1207224&view=rev
        Hide
        William added a comment -

        Hi guys,

        It's great you made improvement about performance by removing this unnecessary copy and thank your for taking into account my comment.

        Do you plan to make a patch for 5.5.1 ? I'd like to have something proper in order to use it production. This gain of performance can really help me...

        Or perhaps in 5.6.0, will it come out soon ?

        Futhermore, in my understanding of getDestinationMap method, it is used to determine the presence of a destination and BrokerRegion class still makes a copy of queue's map and topic's map ? Tell me if I am wrong....

        So, if I'm right about the hypothesis, can't you improve the performance again by adding a new method destinationExists(String destinationName) that just makes a containsKey in the hashmap. Thus we can also remove the copy in all the use cases in which you try to find out if a destination exists or not.

        In any case, even a get() in multiple hashmap is better than a copy, specially when the hashmap is big.

        Thanks.

        Show
        William added a comment - Hi guys, It's great you made improvement about performance by removing this unnecessary copy and thank your for taking into account my comment. Do you plan to make a patch for 5.5.1 ? I'd like to have something proper in order to use it production. This gain of performance can really help me... Or perhaps in 5.6.0, will it come out soon ? Futhermore, in my understanding of getDestinationMap method, it is used to determine the presence of a destination and BrokerRegion class still makes a copy of queue's map and topic's map ? Tell me if I am wrong.... So, if I'm right about the hypothesis, can't you improve the performance again by adding a new method destinationExists(String destinationName) that just makes a containsKey in the hashmap. Thus we can also remove the copy in all the use cases in which you try to find out if a destination exists or not. In any case, even a get() in multiple hashmap is better than a copy, specially when the hashmap is big. Thanks.
        Hide
        Gary Tully added a comment -

        I think a copy is now only created when it is needed, so in cases where the map is modified to included more destinations. But please peek at the code and the call stacks or method usage and improve it.

        Show
        Gary Tully added a comment - I think a copy is now only created when it is needed, so in cases where the map is modified to included more destinations. But please peek at the code and the call stacks or method usage and improve it.
        Hide
        Yves Rey added a comment -

        Hi,

        looks like I faced that same issue with 5.5.0. and after doing basic verifications, I thought that the problem was still on 5.6.0, thus I started fixing 5.5.0 myself.

        I just discovered about that fix in 5.6.0, but I think there is still room for improvement, thus I'd like to expose my ideas about fixing that issue:

        • I did not work on the mutexes themselves, I'm not sure how much gain we can get by using different type of mutex, I believe not much as the trouble is not really on the mutexes but rather on the time it takes to make a copy of hash maps.
        • I think the problem is rather that getDestinationMap should not be used at all in this case (I actually wonder why would it be ever needed. I did not check who uses it). AuthorizationBroker uses it only to test if a destination already exists.

        Thus I simply added a method on the base class to do that check directly on the base hash map.

        If you want I can send my patch.

        Best Regards,
        Yves.

        Show
        Yves Rey added a comment - Hi, looks like I faced that same issue with 5.5.0. and after doing basic verifications, I thought that the problem was still on 5.6.0, thus I started fixing 5.5.0 myself. I just discovered about that fix in 5.6.0, but I think there is still room for improvement, thus I'd like to expose my ideas about fixing that issue: I did not work on the mutexes themselves, I'm not sure how much gain we can get by using different type of mutex, I believe not much as the trouble is not really on the mutexes but rather on the time it takes to make a copy of hash maps. I think the problem is rather that getDestinationMap should not be used at all in this case (I actually wonder why would it be ever needed. I did not check who uses it). AuthorizationBroker uses it only to test if a destination already exists. Thus I simply added a method on the base class to do that check directly on the base hash map. If you want I can send my patch. Best Regards, Yves.
        Hide
        Gary Tully added a comment -

        @Yves, it is now ok to use org.apache.activemq.broker.region.Region#getDestinationMap b/c it does not involve a copy of the hashmap any more. It returns a direct reference to the underlying concurrent hash map.

        Show
        Gary Tully added a comment - @Yves, it is now ok to use org.apache.activemq.broker.region.Region#getDestinationMap b/c it does not involve a copy of the hashmap any more. It returns a direct reference to the underlying concurrent hash map.
        Hide
        William added a comment -

        I have integrated those changes (first mutex and then the remove of the copy of the map) in a 5.5.1 and it works a lot better under load. This may not be the best pattern by it makes ActiveMQ usable and more performant.

        I have tested the 2 fix successively and my findings are :

        • chaning mutex by read/write lock remove contentions between thread and improve performance a lot (no more red on theads in jvisualwm)
        • removing the copy of the map improve memory usage and performance

        Tested with 2000 topics on a single broker with one message every 5 seconds. Without those fixes it wasn't performant.

        Show
        William added a comment - I have integrated those changes (first mutex and then the remove of the copy of the map) in a 5.5.1 and it works a lot better under load. This may not be the best pattern by it makes ActiveMQ usable and more performant. I have tested the 2 fix successively and my findings are : chaning mutex by read/write lock remove contentions between thread and improve performance a lot (no more red on theads in jvisualwm) removing the copy of the map improve memory usage and performance Tested with 2000 topics on a single broker with one message every 5 seconds. Without those fixes it wasn't performant.
        Hide
        Yves Rey added a comment -

        Hi Gary,

        Yes, I saw your changes and I agree they will significantly reduce the contention on the mutex, but there is still a copy made in RegionBroker#getDestinationMap. That copy is now out of the mutex thus will not trigger contention, but is still much heavier than doing a hash lookup.
        Also, since you are returning a reference of the map, I believe the read lock in getDestinationMap is likely useless. And I wonder that even if it is not the case currently, some code could modify the content of the returned map outside the mutex mechanism that guarantees integrity of the list of destinations.

        Yves.

        Show
        Yves Rey added a comment - Hi Gary, Yes, I saw your changes and I agree they will significantly reduce the contention on the mutex, but there is still a copy made in RegionBroker#getDestinationMap. That copy is now out of the mutex thus will not trigger contention, but is still much heavier than doing a hash lookup. Also, since you are returning a reference of the map, I believe the read lock in getDestinationMap is likely useless. And I wonder that even if it is not the case currently, some code could modify the content of the returned map outside the mutex mechanism that guarantees integrity of the list of destinations. Yves.
        Show
        Gary Tully added a comment - further update in http://git-wip-us.apache.org/repos/asf/activemq/commit/27b3a7c3

          People

          • Assignee:
            Gary Tully
            Reporter:
            William
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development