  ActiveMQ / AMQ-4924

Duplicate messages are left in the persistence store

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.8.0, 5.9.0
    • Fix Version/s: 5.9.1, 5.10.0
    • Component/s: Broker
    • Labels:
      None

      Description

      We have a local and remote broker connected with a duplex bridge, which is initiated by the remote broker.
      Producers are attached to the remote broker, one consumer to the local broker.
      The following scenario causes messages to be left in the local store, which are replayed when the local broker is restarted:

      1. messages are forwarded from the remote broker to the local broker
      2. messages are dispatched to the local consumer
      3. the connection between the local and remote broker fails
      4. the local broker tries to acknowledge the message reception to the remote broker, which fails
      5. the remote broker reconnects
      6. the messages are resent
      7. the local broker correctly identifies them as duplicates, but nevertheless puts them into the store, where they remain until the local broker is restarted
      8. other messages are produced and consumed without a problem
      9. the local broker is restarted
      10. the duplicates are now delivered to the local consumer again and of course out of order

      This behaviour can be identified by a queue size which does not seem to shrink below a certain number, even if a consumer is connected and consuming other messages.

      When the log level is set to TRACE, these messages indicate the problem:

      2013-12-06 20:35:17,405 TRACE .a.a.b.r.c.AbstractStoreCursor - org.apache.activemq.broker.region.cursors.QueueStorePrefetch@c0bc4f:testqueue,batchResetNeeded=false,storeHasMessages=true,size=0,cacheEnabled=true,maxBatchSize:1 - cursor got duplicate: ID:smcexp5-58011-1386358514283-7:1:1:1:1, 4 [ActiveMQ VMTransport: vm://LOCAL#19-1]
      2013-12-06 20:35:17,412 TRACE .a.a.b.r.c.AbstractStoreCursor - org.apache.activemq.broker.region.cursors.QueueStorePrefetch@c0bc4f:testqueue,batchResetNeeded=false,storeHasMessages=false,size=1,cacheEnabled=false,maxBatchSize:1 - fillBatch [ActiveMQ BrokerService[LOCAL] Task-2]
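
      For orientation, a minimal sketch of the topology described above; broker names, ports and data directories are illustrative and not taken from the attached test case:

      import org.apache.activemq.broker.BrokerService;
      import org.apache.activemq.network.NetworkConnector;

      public class DuplexBridgeTopologySketch {
          public static void main(String[] args) throws Exception {
              // LOCAL broker: hosts the single consumer and accepts the incoming duplex bridge.
              BrokerService local = new BrokerService();
              local.setBrokerName("LOCAL");
              local.setUseJmx(false);
              local.setDataDirectory("target/local-data"); // persistent store where the duplicates remain
              local.addConnector("tcp://localhost:61616");
              local.start();

              // REMOTE broker: hosts the producers and initiates the duplex bridge.
              BrokerService remote = new BrokerService();
              remote.setBrokerName("REMOTE");
              remote.setUseJmx(false);
              remote.setDataDirectory("target/remote-data");
              NetworkConnector bridge = remote.addNetworkConnector("static:(tcp://localhost:61616)");
              bridge.setDuplex(true); // one connection carries traffic in both directions
              remote.start();
          }
      }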
      
      Attachments

      • AMQ4924.java (12 kB), attached by Ron Koerner


          Activity

          ron.koerner Ron Koerner added a comment -

          Attached test case to reproduce the problem. It can be reproduced with KahaDB or LevelDB.

          ron.koerner Ron Koerner added a comment -

          Attached new version, where the local consumer is disconnected and reconnected to exclude prefetch issues.

          rajdavies Rob Davies added a comment -

          Duplicates were not being checked in duplex connections.
          There is an additional property on NetworkConnector that needs to be enabled - checkDuplicateMessagesOnDuplex (default is false).

          Thank you Ron for investigating this!
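
          A minimal sketch of enabling this on the broker that initiates the duplex bridge, assuming the property follows the usual bean-setter convention on NetworkConnector (variable names and URI are illustrative):

          import org.apache.activemq.broker.BrokerService;
          import org.apache.activemq.network.NetworkConnector;

          class DuplexDuplicateCheckSketch {
              static NetworkConnector configureBridge(BrokerService remote) throws Exception {
                  NetworkConnector bridge = remote.addNetworkConnector("static:(tcp://localhost:61616)");
                  bridge.setDuplex(true);
                  bridge.setCheckDuplicateMessagesOnDuplex(true); // new property, default is false
                  return bridge;
              }
          }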

          ron.koerner Ron Koerner added a comment - - edited

          It also seems to help to enable "supportFailOver" in the broker.

          I don't know about the individual advantages or disadvantages for each solution. Can you elaborate? Is there a reason not to activate supportFailOver or checkDuplicateMessagesOnDuplex permanently/by default?
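
          For reference, a sketch of that workaround, assuming supportFailOver is exposed as a plain bean property on BrokerService (broker name is illustrative):

          import org.apache.activemq.broker.BrokerService;

          class SupportFailOverWorkaroundSketch {
              static BrokerService configureLocalBroker() throws Exception {
                  BrokerService local = new BrokerService();
                  local.setBrokerName("LOCAL");
                  local.setSupportFailOver(true); // default is false
                  return local;
              }
          }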

          ron.koerner Ron Koerner added a comment - - edited

          This also seems to be a duplicate of https://issues.apache.org/jira/browse/AMQ-3473.
          Unfortunately setting auditNetworkProducers = true does not suppress the extra message in the store.

          ron.koerner Ron Koerner added a comment -

          After looking at your change, I wondered if the problem also occurs without a duplex bridge and reran my test. It does happen without duplex. I also may have misunderstood your code, but I was wondering how the remote side could know which messages need to be resent, as sometimes messages could get lost and other times just the response/ack gets lost (like in my case).

          gtully Gary Tully added a comment -

          @Ron supportFailOver is really old code that has no tests and I am not aware of any user of it. It duplicates work done elsewhere. So don't go down that road; see https://issues.apache.org/jira/browse/AMQ-4929

          rajdavies Rob Davies added a comment -

          There's another variable to set on the TransportConnector - auditNetworkProducers (disabled by default) - for regular networks. Messages are always available to be resent unless they've been acknowledged as being consumed, so it's really only duplicate messages that we need to worry about.
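
          A minimal sketch of enabling that on the receiving broker's transport connector (connector URI is illustrative):

          import org.apache.activemq.broker.BrokerService;
          import org.apache.activemq.broker.TransportConnector;

          class AuditNetworkProducersSketch {
              static TransportConnector configureConnector(BrokerService local) throws Exception {
                  TransportConnector connector = local.addConnector("tcp://localhost:61616");
                  connector.setAuditNetworkProducers(true); // disabled by default
                  return connector;
              }
          }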

          ron.koerner Ron Koerner added a comment -

          So if I have no duplex connection I need to activate auditNetworkProducers on the receiving TransportConnector (in my case "LOCAL"), and if I have a duplex connection I need to activate checkDuplicateMessagesOnDuplex on the sending NetworkConnector (in my case "REMOTE").
          But when messages are sent from REMOTE to LOCAL, the DemandForwardingBridge on the LOCAL side will actually check for the duplicate, even if it is configured only on the REMOTE side.
          As soon as 5.10 is released I will change my setup accordingly. For now I can use supportFailOver as a workaround.

          Thank you very much!

          Other people with this problem may appreciate a hint to use auditNetworkProducers or checkDuplicateMessagesOnDuplex whenever they encounter the situation that there is a duplicate detected by the cursor.
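
          Put together as a sketch (illustrative URIs; assumes the bean setters exist as discussed above): auditNetworkProducers belongs on the TransportConnector of the broker that receives the forwarded messages, and checkDuplicateMessagesOnDuplex on the NetworkConnector of the broker that initiates the duplex bridge:

          import org.apache.activemq.broker.BrokerService;
          import org.apache.activemq.broker.TransportConnector;
          import org.apache.activemq.network.NetworkConnector;

          class DuplicateProtectionSummarySketch {
              // Non-duplex network of brokers: audit on the receiving (LOCAL) side.
              static void protectReceivingSide(BrokerService local) throws Exception {
                  TransportConnector connector = local.addConnector("tcp://localhost:61616");
                  connector.setAuditNetworkProducers(true);
              }

              // Duplex bridge: check duplicates on the initiating (REMOTE) side.
              static void protectDuplexBridge(BrokerService remote) throws Exception {
                  NetworkConnector bridge = remote.addNetworkConnector("static:(tcp://localhost:61616)");
                  bridge.setDuplex(true);
                  bridge.setCheckDuplicateMessagesOnDuplex(true);
              }
          }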


            People

            • Assignee: rajdavies Rob Davies
            • Reporter: ron.koerner Ron Koerner
            • Votes: 0
            • Watchers: 3
