ActiveMQ / AMQ-2401

Hangs in fan-in to DUPS_OK_ACKNOWLEDGE queue receivers

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.3.0
    • Fix Version/s: 5.3.0
    • Component/s: None
    • Labels: None
    • Environment:

      Description

      While running performance tests I was seeing hangs in several tests involving DUPS_OK queue receivers. My suspicion is that this is related to "too lazy" DUPS_OK acknowledgements. Changing the queue prefetchLimit to 100 made the problem go away. This needs more investigation, but it seems we can get into trouble if the queue size is smaller than the receiver's prefetchLimit, and this should be avoided. It is also possible that something more complicated is happening in my tests; I haven't yet been able to reproduce this outside my performance test environment.
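      For context (not part of the original report), the prefetchLimit the workaround refers to can be lowered on the client side. A minimal sketch, assuming a broker at tcp://localhost:61616 and the stock ActiveMQ client API, that drops the queue prefetch from its default of 1000 to 100:

      import javax.jms.Connection;

      import org.apache.activemq.ActiveMQConnectionFactory;
      import org.apache.activemq.ActiveMQPrefetchPolicy;

      public class LowPrefetchExample {
          public static void main(String[] args) throws Exception {
              ActiveMQConnectionFactory factory =
                      new ActiveMQConnectionFactory("tcp://localhost:61616");

              // Lower the queue prefetch (default 1000) to 100, the workaround
              // mentioned in the description.
              ActiveMQPrefetchPolicy prefetch = new ActiveMQPrefetchPolicy();
              prefetch.setQueuePrefetch(100);
              factory.setPrefetchPolicy(prefetch);

              Connection connection = factory.createConnection();
              connection.start();
              // ... create DUPS_OK sessions and consumers as usual ...
              connection.close();
          }
      }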

      Attachments

      1. AMQ-2401.patch (3 kB, Hiram Chirino)
      2. AMQ2401.txt (8 kB, Colin MacNaughton)

        Activity

        Hiram Chirino added a comment -

        I added the test case to the build, slightly modified so that it would exacerbate the DUPS_OK problem and fail more quickly.
        rev: 818496
        The fix for the test case went in as rev: 818487

        Hiram Chirino added a comment -

        Attaching a patch which changes the DUPS_OK queue consumer behavior to match that of AUTO_ACK. Topic consumers should still behave as before.
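        For context, the two acknowledgement modes mentioned here are chosen when the JMS session is created; a minimal sketch (not the attached patch, assuming a broker at tcp://localhost:61616):

        import javax.jms.Connection;
        import javax.jms.MessageConsumer;
        import javax.jms.Queue;
        import javax.jms.Session;

        import org.apache.activemq.ActiveMQConnectionFactory;

        public class DupsOkConsumerExample {
            public static void main(String[] args) throws Exception {
                Connection connection = new ActiveMQConnectionFactory(
                        "tcp://localhost:61616").createConnection();
                connection.start();

                // A DUPS_OK queue consumer like the ones that hang in this report.
                // Switching the second argument to Session.AUTO_ACKNOWLEDGE gives the
                // acknowledgement behavior the patch makes queue consumers match.
                Session session = connection.createSession(false, Session.DUPS_OK_ACKNOWLEDGE);
                Queue queue = session.createQueue("TEST.QUEUE");
                MessageConsumer consumer = session.createConsumer(queue);

                System.out.println("Received: " + consumer.receive(5000));
                connection.close();
            }
        }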

        Hiram Chirino added a comment -

        I think the safest bet will be to change the client so that it acknowledges more frequently, at least in the queue DUPS_OK case.

        Colin MacNaughton added a comment -

        After some changes committed by Rob and Dejan on the 5.4 trunk, the 50_1_1 use case is better than before. However, setting the pending queue policy to VM brings back the old buggy behavior, which is problematic for those that don't wish to page messages to disk:

        Modify the setup to include the VM pending cursor (Java):

        entry.setPendingQueuePolicy(new VMPendingQueueMessageStoragePolicy());
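        For reference, a minimal sketch (with assumed names; not the actual test setup) of applying that policy entry to an embedded broker:

        import org.apache.activemq.broker.BrokerService;
        import org.apache.activemq.broker.region.policy.PolicyEntry;
        import org.apache.activemq.broker.region.policy.PolicyMap;
        import org.apache.activemq.broker.region.policy.VMPendingQueueMessageStoragePolicy;

        public class VmCursorBrokerExample {
            public static void main(String[] args) throws Exception {
                BrokerService broker = new BrokerService();
                broker.setPersistent(false);
                broker.addConnector("tcp://localhost:61616");

                // Keep pending queue messages in memory (VM cursor) instead of paging
                // them to disk; per this comment, this is the configuration that brings
                // the old behavior back.
                PolicyEntry entry = new PolicyEntry();
                entry.setPendingQueuePolicy(new VMPendingQueueMessageStoragePolicy());

                PolicyMap policyMap = new PolicyMap();
                policyMap.setDefaultEntry(entry);
                broker.setDestinationPolicy(policyMap);

                broker.start();
            }
        }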
        
        Colin MacNaughton added a comment -

        The attached unit test can be used to demonstrate two issues. If you set PRODUCER_COUNT=1 and CONSUMER_COUNT=1 and change the producer message size to 1300 bytes, you can see that this does indeed produce a hang, since the consumer's prefetch is 1000 and fewer than that number of messages fit into the queue.

        However, running the test unmodified (i.e. 50 producers and a message size of 1024 bytes) doesn't result in a hang, but it does result in a serious performance bottleneck; there is some sort of contention happening in the broker. This is the behavior I was seeing in my original performance runs: simply a severe slowdown.
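        For illustration only, a simplified sketch of such a fan-in scenario (hypothetical names and counts; the real reproducer is the attached AMQ2401.txt): many producers feeding a single DUPS_OK queue consumer.

        import javax.jms.BytesMessage;
        import javax.jms.Connection;
        import javax.jms.MessageConsumer;
        import javax.jms.MessageProducer;
        import javax.jms.Queue;
        import javax.jms.Session;

        import org.apache.activemq.ActiveMQConnectionFactory;

        public class FanInSketch {
            static final int PRODUCER_COUNT = 50;        // fan-in: many senders
            static final int MESSAGE_SIZE = 1024;        // bytes
            static final int MESSAGES_PER_PRODUCER = 1000;

            public static void main(String[] args) throws Exception {
                Connection connection = new ActiveMQConnectionFactory(
                        "tcp://localhost:61616").createConnection();
                connection.start();

                // Single DUPS_OK consumer draining the queue.
                Session consumerSession =
                        connection.createSession(false, Session.DUPS_OK_ACKNOWLEDGE);
                Queue queue = consumerSession.createQueue("TEST.FAN.IN");
                MessageConsumer consumer = consumerSession.createConsumer(queue);

                // Many producers sending small messages into the same queue.
                for (int p = 0; p < PRODUCER_COUNT; p++) {
                    new Thread(() -> {
                        try {
                            Session session =
                                    connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                            MessageProducer producer = session.createProducer(queue);
                            for (int i = 0; i < MESSAGES_PER_PRODUCER; i++) {
                                BytesMessage msg = session.createBytesMessage();
                                msg.writeBytes(new byte[MESSAGE_SIZE]);
                                producer.send(msg);
                            }
                            session.close();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }).start();
                }

                // Receive loop; with the bug this stalls or slows dramatically.
                for (int i = 0; i < PRODUCER_COUNT * MESSAGES_PER_PRODUCER; i++) {
                    consumer.receive(30_000);
                }
                connection.close();
            }
        }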

        Colin MacNaughton added a comment -

        Patch file for a unit test reproducing the issue.


          People

          • Assignee: Unassigned
          • Reporter: Colin MacNaughton
          • Votes: 0
          • Watchers: 0
