Qpid
  1. Qpid
  2. QPID-3896

Broker crash when using auto delete queues in a cluster

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.17
    • Fix Version/s: 0.16
    • Component/s: C++ Clustering
    • Labels:
      None

      Description

      I can get the broker to crash with a simple configuration involving multiple 'auto delete' queues.

      The following client pseudo code can cause the crash:

      loc_sess = [ ];

      1. create couple of sessions
        for i in range(in_loops):
        queue_durability = False;
        if (i % 2 == 1):
        queue_durability = True;
      2. create new local session[s]
        lsess = self.connection.session(loc_sess_name % i);
      3. delete the queue (if needed)
        self.cleanup(in_queue=loc_q_name % i);
      4. declare auto-delete queue[s]
        lsess.queue_declare(queue=loc_q_name % i,
        auto_delete=True,
        arguments= {"qpid.auto_delete_timeout" : q_timeout}

        ,
        durable=queue_durability);

      1. check that queue[s] is still available
        result = lsess.queue_query(queue=loc_q_name % i);
        self.assertEqual(loc_q_name % i, result.queue);
      1. bind queue to exchange amf.fanout
        lsess.exchange_bind(exchange=e_name,
        queue=loc_q_name % i,
        binding_key=f_name);
      1. append the session to list
        loc_sess.append(lsess);
      1. send messages to the queues via amq.fanout
        dp = sess.delivery_properties(routing_key=f_name);
        msg_cnt = random.randint(*MSG_CNT_RR);
        print "setup: in_loops:%d, msg_cnt:%d" % (in_loops, msg_cnt);
        for j in range(msg_cnt):
        sess.message_transfer(destination=e_name,
        message=qpid.datatypes.Message(dp, msg_layout % j));
      1. check that queues contain correct message count via QMF
        self.startQmf();
        for i in range(in_loops):
        sq = self.qmf_session.getObjects(_class="queue", name=loc_q_name % i)[0];
        self.assertEqual (sq.msgDepth, msg_cnt);
      1. receive one (first) message from the queues
        for i in range(in_loops):
        loc_sess[i].message_subscribe(destination="dlq", queue=loc_q_name % i);
        loc_sess[i].message_flow(destination="dlq", value=0xFFFFFFFFL,
        unit=loc_sess[i].credit_unit.message)
        loc_sess[i].message_flow(destination="dlq", value=0xFFFFFFFFL,
        unit=loc_sess[i].credit_unit.byte)
        dlq = loc_sess[i].incoming("dlq");

      msg=dlq.get(timeout=1);
      self.assertEqual(msg_layout % 0, msg.body);

      1. check that queues are present at this point (subscription still alive atm)
        for i in range(in_loops):
      2. browse sessions
        result = loc_sess[i].queue_query(queue=loc_q_name % i);
        self.assertEqual(loc_q_name % i, result.queue);

      loc_sess[i].close();

      1. check that queues are still available (after local sessions JUST closed)
        for i in range(in_loops):
      2. browse sessions
        result = sess.queue_query(queue=loc_q_name % i);
        self.assertEqual(loc_q_name % i, result.queue);

      print "sleeping - waiting for queue auto timeout"
      time.sleep(q_timeout+AD_TIMEOUT_TOL);

      1. check whether queue has been deleted (expected to be deleted)
        for i in range(in_loops):
        result = sess.queue_query(queue=loc_q_name % i);
        self.assert_(not result.queue);

      Analysis:

      The ClusterTimer is unable to handle storing two timer tasks that have the same name. The Queue code creates a timer task for each auto delete queue. These tasks all have the same name "DelayedAutoDeletion". This causes ClusterTimer::add() to throw an exception as it thinks there are duplicate timer tasks.

        Activity

        Hide
        Justin Ross added a comment -

        Reviewed by Alan. Approved for 0.16.

        Show
        Justin Ross added a comment - Reviewed by Alan. Approved for 0.16.
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4413/#review6117
        -----------------------------------------------------------

        Ship it!

        • Alan

        On 2012-03-20 14:06:07, Kenneth Giusti wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4413/

        -----------------------------------------------------------

        (Updated 2012-03-20 14:06:07)

        Review request for qpid and Alan Conway.

        Summary

        -------

        When an auto-delete queue is created, a timer is used to detect when the queue should be deleted. Since the timer is managed by the queue, the name of the timer is unimportant, and the queue code uses the same name for all timers.

        This creates a problem for the clustering code, which must monitor each timer. This patch changes the queue code to create a timer with a name based on the name of the queue. This allows the cluster code to correctly distinguish these timers.

        This addresses bug qpid-3896.

        https://issues.apache.org/jira/browse/qpid-3896

        Diffs

        -----

        /trunk/qpid/cpp/src/qpid/broker/Queue.cpp 1302629

        /trunk/qpid/cpp/src/qpid/cluster/ClusterTimer.cpp 1302629

        Diff: https://reviews.apache.org/r/4413/diff

        Testing

        -------

        unit

        Thanks,

        Kenneth

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4413/#review6117 ----------------------------------------------------------- Ship it! Alan On 2012-03-20 14:06:07, Kenneth Giusti wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4413/ ----------------------------------------------------------- (Updated 2012-03-20 14:06:07) Review request for qpid and Alan Conway. Summary ------- When an auto-delete queue is created, a timer is used to detect when the queue should be deleted. Since the timer is managed by the queue, the name of the timer is unimportant, and the queue code uses the same name for all timers. This creates a problem for the clustering code, which must monitor each timer. This patch changes the queue code to create a timer with a name based on the name of the queue. This allows the cluster code to correctly distinguish these timers. This addresses bug qpid-3896. https://issues.apache.org/jira/browse/qpid-3896 Diffs ----- /trunk/qpid/cpp/src/qpid/broker/Queue.cpp 1302629 /trunk/qpid/cpp/src/qpid/cluster/ClusterTimer.cpp 1302629 Diff: https://reviews.apache.org/r/4413/diff Testing ------- unit Thanks, Kenneth
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4413/
        -----------------------------------------------------------

        Review request for qpid and Alan Conway.

        Summary
        -------

        When an auto-delete queue is created, a timer is used to detect when the queue should be deleted. Since the timer is managed by the queue, the name of the timer is unimportant, and the queue code uses the same name for all timers.

        This creates a problem for the clustering code, which must monitor each timer. This patch changes the queue code to create a timer with a name based on the name of the queue. This allows the cluster code to correctly distinguish these timers.

        This addresses bug qpid-3896.
        https://issues.apache.org/jira/browse/qpid-3896

        Diffs


        /trunk/qpid/cpp/src/qpid/broker/Queue.cpp 1302629
        /trunk/qpid/cpp/src/qpid/cluster/ClusterTimer.cpp 1302629

        Diff: https://reviews.apache.org/r/4413/diff

        Testing
        -------

        unit

        Thanks,

        Kenneth

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4413/ ----------------------------------------------------------- Review request for qpid and Alan Conway. Summary ------- When an auto-delete queue is created, a timer is used to detect when the queue should be deleted. Since the timer is managed by the queue, the name of the timer is unimportant, and the queue code uses the same name for all timers. This creates a problem for the clustering code, which must monitor each timer. This patch changes the queue code to create a timer with a name based on the name of the queue. This allows the cluster code to correctly distinguish these timers. This addresses bug qpid-3896. https://issues.apache.org/jira/browse/qpid-3896 Diffs /trunk/qpid/cpp/src/qpid/broker/Queue.cpp 1302629 /trunk/qpid/cpp/src/qpid/cluster/ClusterTimer.cpp 1302629 Diff: https://reviews.apache.org/r/4413/diff Testing ------- unit Thanks, Kenneth
        Hide
        Ken Giusti added a comment -

        Fixup of the previous (invalid) patch - naming must be consistent amoung all members of the cluster, so use the queue's name not a random uuid!

        Also, added assertClusterSafe() call, per Alan's suggestion.

        Show
        Ken Giusti added a comment - Fixup of the previous (invalid) patch - naming must be consistent amoung all members of the cluster, so use the queue's name not a random uuid! Also, added assertClusterSafe() call, per Alan's suggestion.
        Hide
        Ken Giusti added a comment -

        A work-around patch that augments the Queue's auto delete timer with a unique uuid string. This prevents ClusterTimer::add() from getting different timer tasks with the same name.

        Show
        Ken Giusti added a comment - A work-around patch that augments the Queue's auto delete timer with a unique uuid string. This prevents ClusterTimer::add() from getting different timer tasks with the same name.

          People

          • Assignee:
            Ken Giusti
            Reporter:
            Ken Giusti
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development