  1. Qpid
  2. QPID-2295

Clustered broker crashes with inconsistency error

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.6
    • Component/s: C++ Broker
    • Labels:
      None
    • Environment:

      RHEL 5.4 on VMWare, 32 bit

      Description

      When running two brokers in a cluster, one of them will fail with a message like the following:

      2009-12-17 14:04:23 error qpid/amqp_0_10/SessionHandler.cpp:97: Execution exception: invalid-argument: na1mdev1@QPID.f16efd17-e55d-4a40-b9b0-c506a4df5f3e: confirmed < (805+0) but only sent < (800+0) (qpid/SessionState.cpp:151)
      2009-12-17 14:04:23 critical qpid/cluster/ErrorCheck.cpp:87: cluster(10.59.174.211:3060 READY/error) local error 126323 did not occur on member 10.59.174.186:32003: invalid-argument: na1mdev1@QPID.f16efd17-e55d-4a40-b9b0-c506a4df5f3e: confirmed < (805+0) but only sent < (800+0) (qpid/SessionState.cpp:151)
      2009-12-17 14:04:23 error ./qpid/cluster/PollableQueue.h:58: Error delivering frames: local error did not occur on all cluster members : invalid-argument: na1mdev1@QPID.f16efd17-e55d-4a40-b9b0-c506a4df5f3e: confirmed < (805+0) but only sent < (800+0) (qpid/SessionState.cpp:151) (qpid/cluster/ErrorCheck.cpp:89)
      2009-12-17 14:04:23 notice qpid/cluster/Cluster.cpp:369: cluster(10.59.174.211:3060 LEFT/error) leaving cluster na1m-dev1
      2009-12-17 14:04:23 notice qpid/broker/Broker.cpp:340: Shut down
      ...
      2009-12-17 14:25:00 error qpid/amqp_0_10/SessionHandler.cpp:97: Execution exception: invalid-argument: na1mdev1@QPID.28265958-4239-4fdf-8772-f6630a7f01fe: confirmed < (742+0) but only sent < (729+0) (qpid/SessionState.cpp:151)
      2009-12-17 14:25:00 critical qpid/cluster/ErrorCheck.cpp:87: cluster(10.59.174.211:3874 READY/error) local error 151214 did not occur on member 10.59.174.186:32003: invalid-argument: na1mdev1@QPID.28265958-4239-4fdf-8772-f6630a7f01fe: confirmed < (742+0) but only sent < (729+0) (qpid/SessionState.cpp:151)
      2009-12-17 14:25:00 error ./qpid/cluster/PollableQueue.h:58: Error delivering frames: local error did not occur on all cluster members : invalid-argument: na1mdev1@QPID.28265958-4239-4fdf-8772-f6630a7f01fe: confirmed < (742+0) but only sent < (729+0) (qpid/SessionState.cpp:151) (qpid/cluster/ErrorCheck.cpp:89)
      2009-12-17 14:25:00 notice qpid/cluster/Cluster.cpp:369: cluster(10.59.174.211:3874 LEFT/error) leaving cluster na1m-dev1
      2009-12-17 14:25:00 notice qpid/broker/Broker.cpp:340: Shut down

      qpidd.conf looks like this:

      data-dir=/usr/local/adobe/qpid-data
      #no-module-dir=1
      #load-module=/usr/local/adobe/qpid/lib/qpid/daemon/cluster.so
      #load-module=/usr/local/adobe/qpid/lib/qpid/daemon/msgstore.so
      cluster-mechanism=PLAIN
      #cluster-mechanism=ANONYMOUS
      cluster-name="na1m-dev1"
      cluster-username="na1mdev1"
      cluster-password="***"
      log-to-file=/usr/local/adobe/qpid-data/qpidd.log
      log-source=1
      trace=0
      auth=1
      #acl-file=/usr/local/adobe/qpid-data/qpidd.acl
      wait=60

      The client is the Java JMS client from 0.6beta1. The connection strings are as follows:

      connectionfactory.qpidConnectionFactory = amqp://na1mdev1:***@test/?brokerlist='tcp://na1m-dev1.services.adobe.com:443?ssl='true',connectdelay='1000',connecttimeout='5000'',failover='roundrobin?cyclecount='999''

      destination.recvQueue=direct:////test-prattrs?durable='true'
      destination.sendQueue=direct:////test-prattrs?durable='true'

      Qpid was built from trunk at the following revision:

      [prattrs@localhost qpid]$ svn info
      Path: .
      URL: http://svn.apache.org/repos/asf/qpid/trunk/qpid
      Repository Root: http://svn.apache.org/repos/asf
      Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
      Revision: 891406
      Node Kind: directory
      Schedule: normal
      Last Changed Author: aconway
      Last Changed Rev: 891395
      Last Changed Date: 2009-12-16 11:31:20 -0800 (Wed, 16 Dec 2009)

      The messaging store was built from trunk at the following revision:

      [prattrs@localhost cpp]$ svn info
      Path: .
      URL: http://anonsvn.jboss.org/repos/rhmessaging/store/trunk/cpp
      Repository Root: http://anonsvn.jboss.org/repos/rhmessaging
      Repository UUID: 06e15bec-b515-0410-bef0-cc27a458cf48
      Revision: 3750
      Node Kind: directory
      Schedule: normal
      Last Changed Author: kpvdr
      Last Changed Rev: 3747
      Last Changed Date: 2009-12-16 10:24:45 -0800 (Wed, 16 Dec 2009)

      I'll work on packaging up driver code for external consumption. Basically, though, I drive Qpid by establishing 8 sending and 8 receiving connections, each with a single producer or consumer running on its own thread. Message bodies are randomly generated 1000-character strings. Each sender thread sends 1000 of these messages, for a total of 8000 messages sent. All the messages get through because the primary survives in good health. For each of the failures above, the sending period lasted about 20 seconds, at a rate of about 450 messages per second.
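
      As a rough illustration of the load shape described above (this is not the attached driver, and it does not talk to a broker), the sender side could be sketched as follows. The class name DriverSketch, the methods randomBody and runSenders, and the in-memory sink are all hypothetical stand-ins; in a real run the sink would presumably wrap a JMS producer on its own connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Consumer;

// Hypothetical sketch of the sender load only; no Qpid or JMS calls are made.
class DriverSketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Build a random alphanumeric message body of the requested length.
    static String randomBody(int length) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rnd.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    // Start one thread per sender; each hands perSender random bodies to the sink.
    // The sink is an assumption standing in for a producer's send call and must be thread-safe.
    static void runSenders(int senders, int perSender, int bodyLength, Consumer<String> sink) {
        List<Thread> threads = new ArrayList<>();
        for (int s = 0; s < senders; s++) {
            Thread t = new Thread(() -> {
                for (int i = 0; i < perSender; i++) {
                    sink.accept(randomBody(bodyLength));
                }
            }, "sender-" + s);
            threads.add(t);
            t.start();
        }
        // Wait for every sender thread to finish before returning.
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for senders", e);
            }
        }
    }

    public static void main(String[] args) {
        // In-memory queue used in place of the broker for this sketch.
        Queue<String> received = new ConcurrentLinkedQueue<>();
        runSenders(8, 1000, 1000, received::add);
        System.out.println("messages sent: " + received.size());
    }
}
```

      With 8 senders and 1000 messages each, the sink sees 8000 bodies of 1000 characters. The receiving side (8 consumers, one per connection and thread) is omitted here.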

        Attachments

        1. qpid-driver.diff
          4 kB
          Alan Conway
        2. qpid-driver.zip
          4.84 MB
          Sandy Pratt

            People

            • Assignee:
              aconway Alan Conway
              Reporter:
              prattrs Sandy Pratt