Qpid
  1. Qpid
  2. QPID-1909

Consumer with byte credit can get ignored if a large message "eclipses" a small one.

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 0.5
    • Fix Version/s: Future
    • Component/s: C++ Broker
    • Labels:
      None

      Description

      Given: A consumer with byte credit N for a queue with messages Big size > N and Small size < N
      The consumer should not be able to consume Big. Now remove Big with a different consumer.
      The consumer should now be able to consume Small, but doesn't

      The problem in Queue.cpp is twofold:

      • if a consumer returns "false" from accept, it is assumed to be blocked and is removed from the listener set. It won't be notified of new messages
      • the queue only notifies when new messages arrive. It does not notify when a blocking message is removed.

      This is demonstrated by adding the following to test_credit_flow_bytes in message.py:

      1. Check for bug that allowed a small message to be hidden by a larger one.
        session.message_transfer(message=Message(session.delivery_properties(routing_key="q"), "toobigtoobigtoobigtoobig"))
        session.message_transfer(message=Message(session.delivery_properties(routing_key="q"), "abcdefgh"))
        session.message_flow(unit = session.credit_unit.byte, value = msg_size, destination = "c")
        try:
        q.get(timeout = 1) # Should fail, message is too big.
        self.fail("Expected Empty exception")
        except Empty: None
      1. Create another consumer to remove the large message
        session.message_subscribe(queue = "q", destination = "c2")
        session.message_flow(unit = session.credit_unit.byte, value = 0xFFFFFFFFL, destination = "c2")
        session.message_flow(unit = session.credit_unit.message, value = 1, destination = "c2")
        self.assertDataEquals(session, session.incoming("c2").get(timeout=1), "toobigtoobigtoobigtoobig")
      1. Now the first consumer should be able to receive the smaller message.
        self.assertDataEquals(session, q.get(timeout = 1), "abcdefgh")
      1. QPID-1909.patch
        6 kB
        Jonathan Robie

        Activity

        Hide
        Jonathan Robie added a comment -

        Queue::consumeNextMessage() returns either CONSUME or CANT_CONSUME, it does not distinguish "can't consume because I have no credit and can't read until I get more credit" from "can't consume because I don't have enough credit for this particular message, but might be able to read a different message".

        The obvious fix for this bug would be to distinguish these two cases, and keep the consumer on the listener list if it has some credit, but not enough for a given message. But that fix has a problem: suppose a consumer has 1 byte credit, it remains on the listener list, and must be considered for incoming messages, but it will never receive a message, it just slows down the search for a consumer for a given message. And it's likely that many consumers would reach that state.

        Another possible solution: require a minimum level of credit, and kick out consumers that do not have that much credit. But there's no obvious way to determine a reasonable minimum credit.

        Another possible solution: use a data structure that makes it possible to select a consumer that has at least N bytes of credit. This would necessarily be slower than just picking the next available consumer, and would require a more complex data structure.

        None of these possible solutions thrill me. Is there a better one?

        Show
        Jonathan Robie added a comment - Queue::consumeNextMessage() returns either CONSUME or CANT_CONSUME, it does not distinguish "can't consume because I have no credit and can't read until I get more credit" from "can't consume because I don't have enough credit for this particular message, but might be able to read a different message". The obvious fix for this bug would be to distinguish these two cases, and keep the consumer on the listener list if it has some credit, but not enough for a given message. But that fix has a problem: suppose a consumer has 1 byte credit, it remains on the listener list, and must be considered for incoming messages, but it will never receive a message, it just slows down the search for a consumer for a given message. And it's likely that many consumers would reach that state. Another possible solution: require a minimum level of credit, and kick out consumers that do not have that much credit. But there's no obvious way to determine a reasonable minimum credit. Another possible solution: use a data structure that makes it possible to select a consumer that has at least N bytes of credit. This would necessarily be slower than just picking the next available consumer, and would require a more complex data structure. None of these possible solutions thrill me. Is there a better one?
        Hide
        Jonathan Robie added a comment -

        I'm considering using the "obvious fix" from the previous comment.

        Consumers will eventually ack, raising their credit again, fragmentation is likely to be an issue only temporarily.

        Show
        Jonathan Robie added a comment - I'm considering using the "obvious fix" from the previous comment. Consumers will eventually ack, raising their credit again, fragmentation is likely to be an issue only temporarily.
        Hide
        Jonathan Robie added a comment -

        Fixes the bug. Includes Alan's test.

        Show
        Jonathan Robie added a comment - Fixes the bug. Includes Alan's test.
        Hide
        Jonathan Robie added a comment -

        Fixed in QPID-1909.patch, attached.

        Show
        Jonathan Robie added a comment - Fixed in QPID-1909 .patch, attached.
        Hide
        Gordon Sim added a comment -

        You have some extraneous changes in the patch I think.

        Show
        Gordon Sim added a comment - You have some extraneous changes in the patch I think.
        Hide
        Alan Conway added a comment -

        In a cluster, the paupers list will need to be replicated as part of an update to a new member.

        Follow the example of the existing code to replicate listeners:

        QueueListener::eachListener() called by cluster::UpdateClient::updateQueueListeners()
        which calls cluster::UpdateClient::updateQueueListener()
        which calls addQueueListener generated from cpp/xml/cluster.xml <control add-queue-listener>
        which sends a control that gets dispatched to cluster::Connection::addQueueListener
        which calls QpidListener::addListener()

        I think the whole chain needs to be duplicated for pauapers

        Finally bump up the cluster::CLUSTER_VERSION number since this is a change in the cluster protocol.

        I don't much like the term "paupers" but I can't think of a better term.

        Show
        Alan Conway added a comment - In a cluster, the paupers list will need to be replicated as part of an update to a new member. Follow the example of the existing code to replicate listeners: QueueListener::eachListener() called by cluster::UpdateClient::updateQueueListeners() which calls cluster::UpdateClient::updateQueueListener() which calls addQueueListener generated from cpp/xml/cluster.xml <control add-queue-listener> which sends a control that gets dispatched to cluster::Connection::addQueueListener which calls QpidListener::addListener() I think the whole chain needs to be duplicated for pauapers Finally bump up the cluster::CLUSTER_VERSION number since this is a change in the cluster protocol. I don't much like the term "paupers" but I can't think of a better term.
        Hide
        Jonathan Robie added a comment -

        Cleaned up the patch.

        Replicating the pauper's list should be straightforward, testing to make sure it's replicated may be more difficult. Is there currently a test for that with listeners?

        Show
        Jonathan Robie added a comment - Cleaned up the patch. Replicating the pauper's list should be straightforward, testing to make sure it's replicated may be more difficult. Is there currently a test for that with listeners?
        Hide
        Alan Conway added a comment -

        It should be possible to use a test just like the one in the description but adding a new broker to the cluster just after creating the pauper and then consuming from the new broker.

        Show
        Alan Conway added a comment - It should be possible to use a test just like the one in the description but adding a new broker to the cluster just after creating the pauper and then consuming from the new broker.

          People

          • Assignee:
            Gordon Sim
            Reporter:
            Alan Conway
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:

              Development