ActiveMQ / AMQ-2542
(related: AMQ-2803 "'Zombie' messages created in KahaDB after failover, with warning 'Duplicate message add attempt rejected.'")

Tidy up store duplicate suppression from failover recovery - consistent store implementation with help from transportConnection

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.3.0
    • Fix Version/s: 5.4.0
    • Component/s: Broker
    • Labels:
      None

      Description

      With failover, if a failure occurs before a reply is received to a send or a transaction commit, the send or commit will be replayed and will produce a duplicate.
      The transportConnector should know that recovery/reconnection has happened and should enforce duplicate suppression based on the last producer sequence number obtained from the store.
      Currently, duplicate suppression happens at the jdbc message store add, the amq store reference store, etc. It is not consistent, and it is based on a suitably sized audit window, which may be non-deterministic. It would be best to be fully deterministic and consistent (as in a single persistence adapter API).

      To make this perform well, a transportConnection needs to be flagged as a reconnect, which can trigger the duplicate suppression, possibly requiring a wireformat update. This flag would also help with unmatched acks after failover, but maybe that flag can be in an ack...

      This is relevant both when the broker is restarted and when a connection is dropped.
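The mechanism described above can be sketched as a small audit keyed by producer id: on reconnect the broker seeds the audit from the last sequence number persisted in the store, and any replayed send with a sequence number at or below that value is rejected deterministically. This is an illustrative sketch only; `ProducerSequenceAudit`, `recover`, and `isDuplicate` are hypothetical names, not the actual ActiveMQ API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of deterministic duplicate suppression keyed on the
// last producer sequence number recovered from the persistence store.
class ProducerSequenceAudit {
    private final Map<String, Long> lastSequenceByProducer = new HashMap<>();

    // On reconnect, seed the audit with the last sequence number the store
    // has persisted for this producer.
    void recover(String producerId, long lastStoredSequence) {
        lastSequenceByProducer.put(producerId, lastStoredSequence);
    }

    // Returns true if the message is a replayed duplicate and should be
    // dropped; otherwise records the sequence number and accepts it.
    boolean isDuplicate(String producerId, long sequenceId) {
        Long last = lastSequenceByProducer.get(producerId);
        if (last != null && sequenceId <= last) {
            return true; // already stored before the failover
        }
        lastSequenceByProducer.put(producerId, sequenceId);
        return false;
    }
}
```

Because the decision compares a single persisted counter rather than probing a bounded audit window, the outcome does not depend on how many producers or messages have passed through since recovery.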

      Related issue with a relevant test: https://issues.apache.org/activemq/browse/AMQ-2540

        Issue Links

          Activity

          Gary Tully added a comment -

          flag in the producerInfo - set by the failover replay logic.
          Some further justification: with many concurrent connections, the first recovers fine and produces messages with a unique producerId per message; this can easily exhaust the recovered message audit from the store. This makes the store audit a bit useless (unless maxProducersToAudit is very large) and points to the fact that this is the wrong place to do duplicate suppression in this case.

          Gary Tully added a comment -

          Fixing this issue will remove the need for the excessive duplicate suppression in the jdbc store that causes https://issues.apache.org/activemq/browse/AMQ-2800

          Gary Tully added a comment -

          resolved in 961783

          Implemented for KahaDB and JDBC.


            People

            • Assignee: Gary Tully
            • Reporter: Gary Tully
            • Votes: 1
            • Watchers: 1
