ActiveMQ Classic / AMQ-7028

Poor performance when concurrentStoreAndDispatchQueues + slow FS + Slow Consumers

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.15.4
    • Fix Version/s: None
    • Component/s: KahaDB
    • Labels: None
    • Patch Info: Patch Available
    • Flags: Patch

    Description

      Storing KahaDB files on a high-latency file system (such as NFS) with concurrentStoreAndDispatchQueues=true can cause poor performance when a queue has slow consumers. With this option enabled, ActiveMQ writes the produced messages to the underlying file system one by one (this is implemented with a single-thread ExecutorService).
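
      To illustrate why a single-thread ExecutorService serializes writes, here is a minimal sketch. It is not the KahaDB code: the class name SerializedWriteDemo, the method maxConcurrentWrites, and the use of Thread.sleep as a stand-in for one slow FS write are all assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it does not exist in ActiveMQ.
final class SerializedWriteDemo {

    /**
     * Submits `messages` simulated writes to a single-thread executor and
     * returns the highest number of writes that were ever in flight at once.
     */
    static int maxConcurrentWrites(int messages, long writeLatencyMillis) {
        ExecutorService writer = Executors.newSingleThreadExecutor();
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxInFlight = new AtomicInteger();

        for (int i = 0; i < messages; i++) {
            writer.submit(() -> {
                int now = inFlight.incrementAndGet();
                maxInFlight.accumulateAndGet(now, Math::max);
                try {
                    // Assumption: sleeping stands in for one slow file system write.
                    Thread.sleep(writeLatencyMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }

        writer.shutdown();
        try {
            if (!writer.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulated writes did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writes", e);
        }
        return maxInFlight.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent writes: " + maxConcurrentWrites(20, 10));
    }
}
```

      However many tasks are queued, only one write is in flight at any time, so each message pays the full write latency in turn.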

      Let's say each write to the FS takes 10 ms and the queue has slow consumers. In this case, no matter how many messages the producers try to send concurrently, the maximum throughput we can achieve is 100 TPS (one write per 10 ms). Turning this flag off gives much better performance when sending messages in parallel, because those messages can be batched to the FS in a single write (throughput increases with the number of messages being sent in parallel).
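
      The arithmetic behind that ceiling can be sketched as a small model. This is an idealized upper bound, not a measurement: it assumes the latency of one write does not grow with the number of messages batched into it, and the names StoreThroughputModel, serializedTps, and batchedTps are hypothetical.

```java
// Hypothetical model class; the names are illustrative only.
final class StoreThroughputModel {

    /** Upper bound on messages/second when every message costs one serialized write. */
    static double serializedTps(double writeLatencyMillis) {
        return 1000.0 / writeLatencyMillis;
    }

    /**
     * Upper bound on messages/second when up to `batchSize` pending messages
     * share a single write. Assumption: write latency is independent of batch size.
     */
    static double batchedTps(double writeLatencyMillis, int batchSize) {
        return batchSize * 1000.0 / writeLatencyMillis;
    }

    public static void main(String[] args) {
        System.out.println(serializedTps(10.0));   // 100.0
        System.out.println(batchedTps(10.0, 20));  // 2000.0
    }
}
```

      Under these assumptions, 20 producers sending in parallel against a 10 ms file system could reach up to 2000 TPS with batching, against a fixed 100 TPS when writes are serialized.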

      Looking at the ActiveMQ code, we found that the LevelDB store uses a flag to detect whether the queue has fast or slow consumers and to decide whether to use concurrentStoreAndDispatch:

      https://issues.apache.org/jira/browse/AMQ-3750

      However, this flag is not used in the KahaDB implementation.

      We made a code change so that KahaDBStore receives the flag and uses it to decide whether the message is stored asynchronously.

      We think there is no reason to attempt "StoreAndDispatch" when the destination has slow consumers. It only adds overhead and, on a high-latency FS, leads to really poor performance when the queue has slow consumers.

      For fast consumers, this change has no effect, so we get the best of both options.
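
      The decision described above can be sketched roughly as follows. This is not the attached patch: the class StoreModeDecision, the enum StoreMode, and the parameter hasFastConsumers are hypothetical names standing in for the flag that the store would receive.

```java
// Hypothetical sketch of the decision; not the code from the attached patch.
final class StoreModeDecision {

    enum StoreMode {
        /** Store asynchronously and dispatch concurrently (one write per message). */
        ASYNC_STORE_AND_DISPATCH,
        /** Store synchronously so that concurrent sends can be batched into one write. */
        SYNC_BATCHED_STORE
    }

    /**
     * Uses the concurrent path only when it is enabled AND the destination
     * currently has fast consumers; otherwise falls back to the batched store.
     */
    static StoreMode choose(boolean concurrentStoreAndDispatchQueues, boolean hasFastConsumers) {
        if (concurrentStoreAndDispatchQueues && hasFastConsumers) {
            return StoreMode.ASYNC_STORE_AND_DISPATCH;
        }
        return StoreMode.SYNC_BATCHED_STORE;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true));
        System.out.println(choose(true, false));
    }
}
```

      With the option enabled, a destination with fast consumers keeps the existing behavior, while a destination with slow consumers takes the batched path.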

      Some Results:

      Original Version:

      Fast Consumers:

      Producer
      mean rate = 8248.50 calls/second
      min = 0.42 milliseconds
      max = 756.61 milliseconds
      mean = 11.30 milliseconds
      stddev = 44.05 milliseconds
      median = 6.02 milliseconds
      75% <= 9.79 milliseconds
      95% <= 18.15 milliseconds
      98% <= 27.71 milliseconds
      99% <= 123.51 milliseconds
      99.9% <= 756.61 milliseconds

      Slow consumers:

      Producer
      mean rate = 84.29 calls/second
      min = 86.27 milliseconds
      max = 1467.53 milliseconds
      mean = 1082.55 milliseconds
      stddev = 154.04 milliseconds
      median = 1075.94 milliseconds
      75% <= 1169.10 milliseconds
      95% <= 1308.90 milliseconds
      98% <= 1350.85 milliseconds
      99% <= 1363.61 milliseconds
      99.9% <= 1466.67 milliseconds

      Patched Version:

      Fast Consumers:

      Producer
      count = 890783
      mean rate = 8099.33 calls/second
      min = 0.47 milliseconds
      max = 2259.10 milliseconds
      mean = 13.90 milliseconds
      stddev = 84.84 milliseconds
      median = 5.00 milliseconds
      75% <= 9.08 milliseconds
      95% <= 15.66 milliseconds
      98% <= 32.94 milliseconds
      99% <= 355.52 milliseconds
      99.9% <= 731.69 milliseconds

      Slow consumers:

      Producer
      mean rate = 1732.25 calls/second
      1-minute rate = 1811.80 calls/second
      min = 17.52 milliseconds
      max = 1249.54 milliseconds
      mean = 50.95 milliseconds
      stddev = 130.68 milliseconds
      median = 28.73 milliseconds
      75% <= 32.51 milliseconds
      95% <= 57.04 milliseconds
      98% <= 461.46 milliseconds
      99% <= 937.87 milliseconds
      99.9% <= 1249.48 milliseconds

            People

              Assignee: Unassigned
              Reporter: Alan Protasio (alanprot)
              Votes: 2
              Watchers: 7
