Uploaded image for project: 'Qpid Proton'
  1. Qpid Proton
  2. PROTON-2422

Proton will sometimes fail to send empty frame if the idle timeout ratio between peers is greater than 2.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • proton-c-0.33.0, proton-c-0.35.0
    • proton-c-0.36.0
    • cpp-binding, proton-c
    • None
    • RHEL 8

    Description

      When a connection is made to a proton listener with both sides having different idle timeout intervals, the epoll_timer can fail to trigger, resulting in no empty frames being sent, and the connection dropped with a 'amqp:resource-limit-exceeded: local-idle-timeout expired' exception.

      Instrumentation of the proton library showed that when the an epoll timer deadline was rolled back and the timer resequenced due to the peer idle timeout being shorter than the local, the new timer is pushed on the timer manager heap incorrectly. The timer deadline object should be pushed on the timer heap in order by deadline, by in fact the timer is pushed on the head by timer deadline object address. This causes the invalidated timer to be first on the list, and the proactor timer set incorrectly. When enough time has elapsed, the remote peer will close the connection due to inactivity. Note that if the newly created resequenced timer deadline object has an address lower than the old invalidated timer deadline object, proton will work correctly.

      I've attached a reproducer as well as a patch for the instrumentation. Annotated proton logging from the reproducer is below.

      This issue does not occur prior to 0.33.0

      [builder@SE-RHEL8-ITCM-TEST-01 qpid-proton-idle-timeout-repo $ ] PN_LOG='frame info+' ./a.out 
      listening on 9030
      # The initial connection
      [0x7fdb3c001be0]: SASL:FRAME: -> SASL
      [0x7fdb44002620]: SASL:FRAME: <- SASL
      [0x7fdb44002620]: SASL:FRAME: -> SASL
      [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-mechanisms(64) [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
      [0x7fdb3c001be0]: SASL:FRAME: <- SASL
      [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-mechanisms(64) [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
      [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, deadline 5189836829, proactor_timer* 0x10B9660
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb3c001be0]: AMQP:FRAME:0 -> @sasl-init(65) [mechanism=:ANONYMOUS, initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"]
      [0x7fdb44002620]: AMQP:FRAME:0 <- @sasl-init(65) [mechanism=:ANONYMOUS, initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"]
      [0x7fdb44002620]: SASL: INFO:Authenticated user: anonymous for anonymous with mechanism ANONYMOUS
      [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-outcome(68) [code=0]
      [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-outcome(68) [code=0]
      [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, deadline 5189836829, proactor_timer* 0x10B9660
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb3c001be0]: AMQP:FRAME: -> AMQP
      [0x7fdb3c001be0]: AMQP:FRAME:0 -> @open(16) [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", channel-max=32767, idle-time-out=20000]
      [0x7fdb44002620]: AMQP:FRAME: <- AMQP
      [0x7fdb44002620]: AMQP:FRAME:0 <- @open(16) [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", channel-max=32767, idle-time-out=20000]
      [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb44008fb0, deadline 5189806830, proactor_timer* 0x10B7D30
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb44008ff0, td->list_deadline=5189806830, td->timer=0x00007fdb44008fb0, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb44002620]: AMQP:FRAME: -> AMQP
      [0x7fdb44002620]: AMQP:FRAME:0 -> @open(16) [container-id="6fb1e956-e9ca-4922-95a7-072fc249dc24", channel-max=32767, idle-time-out=5000]
      [0x7fdb3c001be0]: AMQP:FRAME: <- AMQP
      [0x7fdb3c001be0]: AMQP:FRAME:0 <- @open(16) [container-id="6fb1e956-e9ca-4922-95a7-072fc249dc24", channel-max=32767, idle-time-out=5000]
      # Here the client has connected to the listener. The client timeout is 40 seconds, and the listener idle timeout is 10 seconds. The proactor timer for this connection
      # is re-sequenced and the new timer deadline object is placed on the heap. This is placed in address order, and not list_deadline order as it should be.
      [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, deadline 5189799330, proactor_timer* 0x10B9660
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, td->list_deadline=5189836829, td->timer=0x0000000000000000, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, td->resequenced=true
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      # After 10 seconds, the listener server has closed the connection due to inactivity.
      [0x7fdb44002620]: AMQP:ERROR:amqp:resource-limit-exceeded local-idle-timeout expired
      [0x7fdb44002620]: AMQP:FRAME:0 -> @close(24) [error=@error(29) [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout expired"]]
      [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb44008fb0, deadline 5189816830, proactor_timer* 0x10B7D30
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb44008ff0, td->list_deadline=5189816830, td->timer=0x00007fdb44008fb0, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb44002620]: IO:FRAME: -> EOS
      [0x7fdb3c001be0]: AMQP:FRAME:0 <- @close(24) [error=@error(29) [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout expired"]]
      [0x7fdb3c001be0]: IO:FRAME: <- EOS
      [0x7fdb3c001be0]: AMQP:FRAME:0 -> (EMPTY FRAME)
      [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, deadline 5189809330, proactor_timer* 0x10B9660
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, td->list_deadline=5189836829, td->timer=0x0000000000000000, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, td->resequenced=true
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, td->list_deadline=5189836829, td->timer=0x0000000000000000, td->resequenced=false
      [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, td->resequenced=true
      [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
      amqp:resource-limit-exceeded: local-idle-timeout expired
      amqp:resource-limit-exceeded: local-idle-timeout expired
      Sender was disconnected. Test Failed.
      

       

       

       

      Attachments

        1. instrument.patch
          2 kB
          Jesse Hulsizer
        2. reproducer.cpp
          3 kB
          Jesse Hulsizer

        Issue Links

          Activity

            People

              cliffjansen Clifford Jansen
              jhulsizer Jesse Hulsizer
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: