  1. Qpid
  2. QPID-5974

HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.28
    • Fix Version/s: 0.31
    • Component/s: C++ Clustering
    • Labels:
      None

      Description

      Description of problem:

      Running qpid-txtest2 with transactional, durable AMQP 0-10 transfers can bring down an entire Qpid HA cluster. Note that no brokers were killed; only the txtest was run.

      To reproduce:

      3 node cluster
      while qpid-txtest2 -b 20.0.20.200 --tx-count 500 --queues 10 --messages-per-tx 10 --total-messages 1000 --durable 1
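
      The loop above is quoted without its body, so its exact form is unknown. A minimal sketch of one plausible reading, repeating the test until it first exits non-zero, might look like the following. The helper name `run_until_failure` and the `max_runs` cap are hypothetical; only the command line itself comes from the report.

```python
import subprocess

# Command line copied from the report; the broker address is the reporter's own.
TXTEST_CMD = [
    "qpid-txtest2", "-b", "20.0.20.200",
    "--tx-count", "500", "--queues", "10",
    "--messages-per-tx", "10", "--total-messages", "1000",
    "--durable", "1",
]


def run_until_failure(cmd=None, max_runs=100):
    """Run cmd repeatedly (hypothetical helper, not part of Qpid).

    Returns the 1-based number of the first run that exits non-zero,
    or None if all max_runs runs succeed.
    """
    if cmd is None:
        cmd = TXTEST_CMD
    for attempt in range(1, max_runs + 1):
        if subprocess.run(cmd).returncode != 0:
            return attempt
    return None
```

      Calling `run_until_failure()` with no arguments runs the reported command against the reporter's broker address, so the address would need adjusting for another cluster.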

      Result:

      Test fails. Broker logs show critical and error messages like this:

      [root@dhcp-lab-A ~]# grep -E 'error|critical' ~qpidd/qpidd.log
      2014-07-24 14:10:33 [Protocol] error Connection qpid.192.168.6.246:5672-192.168.6.247:34210 timed out: closing
      
      
      [root@dhcp-lab-B ~]# grep -E 'error|critical' ~qpidd/qpidd.log
      2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x6da3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
      2014-07-24 14:10:23 [Protocol] error Connection qpid.ha.link.09e80392-0c79-4239-a1d0-ea5b53c71bd9 closed by error: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x6da3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
      2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
      2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed prematurely, rollback
      2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x7a42) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
      2014-07-24 14:10:38 [Protocol] error Connection qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x7a42) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
      2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
      2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x7a43) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
      2014-07-24 14:10:38 [Protocol] error Connection qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x7a43) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
      2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
      2014-07-24 14:10:40 [HA] critical Shutting down: Backup of tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x7a49) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
      2014-07-24 14:10:40 [Protocol] error Connection qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x7a49) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
      2014-07-24 14:10:40 [Broker] error Could not find dequeued message on commit
      2014-07-24 14:11:10 [HA] error Backup: Joining active cluster, cannot be promoted.
      
      
      [root@dhcp-lab-C ~]# grep -E 'error|critical' ~qpidd/qpidd.log
      2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x53a3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
      2014-07-24 14:10:23 [Protocol] error Connection qpid.ha.link.1bb57f0a-48db-460c-9260-0f5b353e4bd1 closed by error: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending transaction. (drid=0x53a3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
      2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
      2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed prematurely, rollback
      2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction 243b4279: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction 4f4a25df: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction 80cbe9af: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction a3ed917a: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction b7a4b9a0: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction b9ba9995: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction cbd0d6bf: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction e127288a: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction eb43e683: Destroyed prematurely, rollback
      2014-07-24 14:10:35 [HA] error Backup of transaction f29196c1: Destroyed prematurely, rollback
      2014-07-24 14:10:53 [HA] error Backup: Still catching up, cannot be promoted.
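
      To see which queues and record IDs are involved across the three brokers, the `[HA] critical` lines can be tallied. The following is a rough sketch: the regular expression is written against the sample lines quoted in this report only, not against any documented qpidd log format, and the function name `locked_records` is hypothetical.

```python
import re
from collections import defaultdict

# Matches the "[HA] critical Shutting down" lines quoted in this report.
# Assumption: the queue name follows "Backup of" and the record ID appears
# as "(drid=0x...)"; this is inferred from the samples, not from a spec.
LOCKED = re.compile(
    r"\[HA\] critical Shutting down: Backup of (?P<queue>\S+): "
    r".*JERR_MAP_LOCKED.*\(drid=(?P<drid>0x[0-9a-fA-F]+)\)"
)


def locked_records(lines):
    """Map queue name -> list of locked record IDs (drid), in log order."""
    found = defaultdict(list)
    for line in lines:
        match = LOCKED.search(line)
        if match:
            found[match.group("queue")].append(match.group("drid"))
    return dict(found)


if __name__ == "__main__":
    import sys

    # Usage: python locked_records.py /path/to/qpidd.log
    with open(sys.argv[1]) as log:
        for queue, drids in sorted(locked_records(log).items()):
            print(queue, " ".join(drids))
```

      The matching `[Protocol] error` lines repeat the same exception text, so the pattern is anchored on `[HA] critical` to avoid counting each failure twice.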
      
      

            People

            • Assignee:
              aconway Alan Conway
              Reporter:
              aconway Alan Conway