QPID-4285

HA backups continuously disconnect / re-sync after attempting to replicate a deleted queue

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18
    • Fix Version/s: 0.19
    • Component/s: C++ Broker
    • Labels: None

    Description

      Running qmf-stat on the primary broker shows that the auto-delete queue XYZ exists, but running drain against the queue indicates that it does not. QMF is out of sync with the true state of the queue, and as a result the deleted queue was replicated to the backup broker. When the backup attempted to subscribe to the queue, it received an error that the queue had been deleted. The error causes the backup to disconnect from the primary, reconnect, and re-attempt the state replication, and the cycle then repeats.
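
      The loop described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Qpid code: PrimaryModel, replicateOnce and attemptsUntilReady are hypothetical names, and the real broker logic is more involved. It shows why a backup that replicates every queue listed by management never becomes ready while a deleted queue is still listed.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the primary broker; none of this is the Qpid API.
struct PrimaryModel {
    std::set<std::string> listedByManagement;  // queues the management view reports
    std::set<std::string> liveQueues;          // queues that really exist

    // Plays the role of the deleted-queue check seen in the backtrace:
    // subscribing to a queue that no longer exists raises an error.
    void subscribe(const std::string& name) const {
        if (liveQueues.count(name) == 0)
            throw std::runtime_error("resource-deleted: Queue " + name +
                                     " has been deleted.");
    }
};

// One replication pass by the backup: subscribe to every listed queue.
// Returns true if the pass completed, false if an error tore down the link.
bool replicateOnce(const PrimaryModel& primary) {
    try {
        for (const auto& name : primary.listedByManagement)
            primary.subscribe(name);
        return true;
    } catch (const std::runtime_error&) {
        return false;  // link closed; the backup reconnects and starts over
    }
}

// Number of passes needed to finish replication, or -1 if the backup
// never gets there within maxAttempts.
int attemptsUntilReady(const PrimaryModel& primary, int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt)
        if (replicateOnce(primary)) return attempt;
    return -1;
}
```

      In this model, a primary that lists XYZ without XYZ being live makes attemptsUntilReady return -1 for any attempt limit, because nothing changes between passes. Once the two views agree, a single pass succeeds.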

      Sample log output from backup:

      Sep 4 14:51:26 itcm13 qpidd[10392]: 2012-09-04 14:51:26 [System] error resource-deleted: Queue XYZ has been deleted. (qpid/broker/Queue.cpp:1787)
      Sep 4 14:51:26 itcm13 qpidd[10392]: 2012-09-04 14:51:26 [Broker] info Inter-broker link disconnected from 10.3.100.105:9006 Closed by peer
      Sep 4 14:51:28 itcm13 qpidd[10392]: 2012-09-04 14:51:28 [System] info Connecting: 10.3.100.105:9006
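
      To check how often a backup is hitting this pattern, the log can be scanned for a resource-deleted error followed by an inter-broker link disconnect. The helper below is a hypothetical sketch, not part of Qpid. It assumes the message wording shown in the sample above and takes the log as a list of lines.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Counts, per queue name, how many times a "resource-deleted" error was
// followed by an "Inter-broker link disconnected" line.
// Assumes the message wording shown in the sample backup log.
std::map<std::string, int> countDeletedQueueDisconnects(
        const std::vector<std::string>& lines) {
    static const std::regex deleted(
        R"(resource-deleted: Queue (\S+) has been deleted)");
    std::map<std::string, int> counts;
    std::string pending;  // queue named by the most recent unmatched error
    for (const auto& line : lines) {
        std::smatch match;
        if (std::regex_search(line, match, deleted)) {
            assert(match.size() == 2);  // whole match plus one capture group
            pending = match[1].str();
        } else if (!pending.empty() &&
                   line.find("Inter-broker link disconnected") != std::string::npos) {
            ++counts[pending];
            pending.clear();
        }
    }
    return counts;
}
```

      A count that keeps growing for the same queue name suggests the backup is stuck in the disconnect and re-sync cycle rather than seeing a one-off error.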

      Sample log output from the primary:

      Sep 4 14:18:15 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:15 [HA] debug Primary: Known backup connected: host3:9006(ready)
      Sep 4 14:18:15 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:15 [HA] debug Broker: Membership add: host3:9006(ready)
      Sep 4 14:18:15 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:15 [HA] info Broker: Membership changed: host3:9006(ready) system-node1a-cluster:9006(recovering)
      Sep 4 14:18:16 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:16 [System] debug DISCONNECTED [10.3.100.105:9006-10.3.100.13:19841]
      Sep 4 14:18:16 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:16 [HA] debug Primary: Backup disconnected: host3:9006(ready)
      Sep 4 14:18:16 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:16 [HA] debug Broker: Membership remove: 4ddb3222-e66d-4e6a-8c87-14e1a37332cf
      Sep 4 14:18:16 system-node1a-cluster qpidd[8397]: 2012-09-04 14:18:16 [HA] info Broker: Membership changed: system-node1a-cluster:9006(recovering)

      Backup backtrace:

      #0 qpid::broker::Queue::checkNotDeleted (this=0x4d97f70, c=<value optimized out>) at qpid/broker/Queue.cpp:1787
      #1 0x0000003e1dbf67e4 in qpid::broker::Queue::getNextMessage (this=0x4d97f70, m=..., c=...) at qpid/broker/Queue.cpp:385
      #2 0x0000003e1dbf688e in qpid::broker::Queue::dispatch (this=<value optimized out>, c=...) at qpid/broker/Queue.cpp:510
      #3 0x00007f15139e62ba in qpid::ha::ReplicatingSubscription::getNext (q=..., from=..., result=...) at qpid/ha/ReplicatingSubscription.cpp:116
      #4 0x00007f15139e4ebe in qpid::ha::QueueReplicator::initializeBridge (this=0x3868af0, bridge=..., sessionHandler=<value optimized out>) at qpid/ha/QueueReplicator.cpp:121

          People

            Assignee: aconway Alan Conway
            Reporter: dillaman Jason Dillaman