Qpid Dispatch / DISPATCH-1110

Intermittent router hang while running QIT's AMQP large content test

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: None
    • Labels: None
    • Environment: Standard QIT environment.

      Once QIT is built and installed, the environment is set using the config.sh file. See QUICKSTART for details.

    Description

      When running the Qpid Interop Test's AMQP large content test, a stand-alone router will intermittently hang and cause the test to time out.

      The failure appears to be limited to the AMQP list and map types, and usually occurs with the C++ client as the message sender. The C++, Python2 and Python3 receiver clients have all seen this failure, but the Python2 receiver client seems to reproduce it more readily on my hardware.

      In all cases, the test fails after the router sends the consumer what I suppose is the final transfer of a large message (I have not added up the bytes of the many preceding transfers). The consumer then sends a disposition, but the router sends nothing further until the test times out. The consumer can be seen sending heartbeats to the router, but the router sends none of its own.

      ... (plenty of 65550-sized frames R->C)
      R->C 5976	3.454766	::1	::1	AMQP	65550
      R->C 5977	3.454775	::1	::1	AMQP	65550
      R->C 5978	3.454783	::1	::1	AMQP	48171
      C->R 5982	3.529881	::1	::1	AMQP	115	disposition
      C->R 5984	7.530704	::1	::1	AMQP	94	(empty)
      C->R 5986	11.532306	::1	::1	AMQP	94	(empty)
      ...
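
The excerpt above can be checked mechanically. The following is a minimal sketch, not part of the original report, that parses the capture summary lines shown here, totals the lengths of the router-to-consumer frames, and measures how long the router stays silent after its last frame. The function names `parse_trace` and `summarize` are hypothetical, the column layout is assumed to be exactly as pasted above, and the length column is whatever the capture tool reported, so it is not necessarily the AMQP payload size. Only the frames visible in the excerpt are counted; the elided frames are not.

```python
# Capture summary lines copied from the excerpt above. Columns (assumed):
# direction, frame number, time in seconds, source, destination,
# protocol, length, optional info.
TRACE = """\
R->C 5976	3.454766	::1	::1	AMQP	65550
R->C 5977	3.454775	::1	::1	AMQP	65550
R->C 5978	3.454783	::1	::1	AMQP	48171
C->R 5982	3.529881	::1	::1	AMQP	115	disposition
C->R 5984	7.530704	::1	::1	AMQP	94	(empty)
C->R 5986	11.532306	::1	::1	AMQP	94	(empty)
"""


def parse_trace(text):
    """Turn summary lines into dicts, skipping '...' and other filler lines."""
    frames = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[0] not in ("R->C", "C->R"):
            continue
        frames.append({
            "direction": fields[0],
            "number": int(fields[1]),
            "time": float(fields[2]),
            "length": int(fields[6]),
            "info": " ".join(fields[7:]),
        })
    return frames


def summarize(frames):
    """Total the router's frame lengths and measure its silence afterwards."""
    router = [f for f in frames if f["direction"] == "R->C"]
    client = [f for f in frames if f["direction"] == "C->R"]
    last_router_time = max(f["time"] for f in router)
    # Consumer frames sent after the router's last frame went unanswered.
    unanswered = [f for f in client if f["time"] > last_router_time]
    silence = 0.0
    if unanswered:
        silence = max(f["time"] for f in unanswered) - last_router_time
    return {
        "router_bytes": sum(f["length"] for f in router),
        "last_router_time": last_router_time,
        "unanswered_client_frames": len(unanswered),
        "router_silence_seconds": silence,
    }


if __name__ == "__main__":
    print(summarize(parse_trace(TRACE)))
```

Run against a full capture export instead of the excerpt, the same total could be compared with the expected size of the large message to confirm whether frame 5978 really is the final transfer.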

      The router logs show no errors other than the one logged when the consuming client is killed owing to the test timeout.

      ...
      2018-08-29 12:50:23.191754 -0400 SERVER (info) [14]: Accepted connection to ::1:amqp from ::1:37262
      2018-08-29 12:51:19.562695 -0400 SERVER (info) [14]: Connection from ::1:37262 (to ::1:amqp) failed: amqp:connection:framing-error connection aborted
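
As a rough illustration, the time between the connection being accepted and the abort can be read off the two log lines above. This sketch assumes the timestamp layout is exactly as shown (date, time with microseconds, UTC offset) and that both lines refer to the same connection, which the matching peer address ::1:37262 suggests. The helper names `log_time` and `connection_lifetime` are hypothetical.

```python
from datetime import datetime

# The two router log lines quoted above.
LOG = """\
2018-08-29 12:50:23.191754 -0400 SERVER (info) [14]: Accepted connection to ::1:amqp from ::1:37262
2018-08-29 12:51:19.562695 -0400 SERVER (info) [14]: Connection from ::1:37262 (to ::1:amqp) failed: amqp:connection:framing-error connection aborted
"""


def log_time(line):
    """Parse the leading 'date time offset' fields of a router log line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f %z")


def connection_lifetime(log_text):
    """Seconds between the 'Accepted connection' and 'failed:' lines, or None."""
    accepted = failed = None
    for line in log_text.splitlines():
        if "Accepted connection" in line:
            accepted = log_time(line)
        elif "failed:" in line:
            failed = log_time(line)
    if accepted is None or failed is None:
        return None
    return (failed - accepted).total_seconds()


if __name__ == "__main__":
    print(connection_lifetime(LOG))
```

The result covers the whole connection, including the transfer phase and the wait until the test harness kills the client, so it is an upper bound on how long the router was unresponsive.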
      

      The reproducer is not deterministic: the error occurs about 50% of the time on my hardware.

      Attachments

        1. qdrouterd.conf (2 kB, attached by Kim van der Riet)

            People

              Assignee: Ganesh Murthy (gmurthy)
              Reporter: Kim van der Riet (kpvdr)