Qpid Dispatch / DISPATCH-1968

Crash after running series of 1Mb iperf3 against TCP adaptor

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.15.0
    • Fix Version/s: 1.16.0
    • Component/s: Protocol Adaptors
    • Labels: None
    • Environment: Fedora 32 bare metal 64-bit.

      Dispatch at 1.15 release

      Proton git branch master @ 5e7d7af8f

    Description

      Setup

      The test uses a minimal TCP adaptor listener/connector pair on a single router (see attached INTA.conf). All processes run on a single laptop.

      Start an iperf3 server on the default port 5201:
          iperf3 -s

      Run the iperf3 client in a loop against port 5202, which is served by the TCP adaptor:
          iperf3 -c hostname -p 5202 -n 1000000
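
      The report does not show how the client loop was driven. As a minimal sketch, assuming a small Python driver is acceptable, the loop could stop at the first failed or hung run so the failing iteration is easy to spot. The function name run_client_loop and its parameters are hypothetical and not part of the original test setup.

```python
import subprocess


def run_client_loop(cmd, max_runs=100, timeout_s=30.0):
    """Run cmd repeatedly and stop at the first failure or hang.

    Returns (completed_runs, reason), where reason is "completed",
    "timeout", or "exit code N".
    """
    for completed in range(max_runs):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout_s,          # treat a hung client as a failure
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            return completed, "timeout"
        if result.returncode != 0:
            return completed, "exit code %d" % result.returncode
    return max_runs, "completed"
```

      Called as run_client_loop(["iperf3", "-c", "hostname", "-p", "5202", "-n", "1000000"]), this repeats the client command from the report, where "hostname" is the report's own placeholder.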

      Issues

      After a few loops the router crashes, with malloc reporting a corrupted doubly linked list.

      Sometimes the test client hangs for a few seconds until the iperf server times out.

      qdstat shows many leaked qd_buffer_t and stream data objects.

      Observations

      Tracing a single iperf3 session

      A wireshark trace of a single iperf3 session shows the client opening two connections to the router and the router opening two connections to the server. This is expected.

      As the test runs, the client and server exchange a number of control messages, and this exchange works as expected. These messages are test setup and are not part of the iperf payload data.

      Then the payload data starts. After the server has accepted 8 kbytes of iperf payload (in sixteen 512-byte network packets!), the server closes its connection to the TCP connector with a FIN. A few microseconds later the TCP connector sends another 512-byte packet, to which the iperf server responds with a RST.

      Shortly thereafter the connections close with a bunch of TCP FIN packets.

      The router did not crash.

      Running with asan and valgrind memcheck

      Running with either of these tools was inconclusive: neither revealed any stray memory writes or double frees that could corrupt the malloc heap.

      Next steps

      Having the network peer of the TCP connector close the connection mid-stream is a pattern that is not tested in the self tests. A test to generate this pattern is in progress.
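
      The in-progress test is not shown in this report. As a rough sketch of the pattern only, and not the project's actual self test, a stand-in server could accept one connection, read a fixed number of bytes, and then close while the sender is still transmitting. The name start_early_close_server and the 8192-byte default (taken from the trace above) are assumptions made for illustration.

```python
import socket
import threading


def start_early_close_server(close_after=8192, host="127.0.0.1", port=0):
    """Serve one connection, read close_after bytes, then close it.

    port=0 lets the OS pick a free port. Returns (bound_port, thread).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    bound_port = listener.getsockname()[1]

    def serve():
        try:
            conn, _ = listener.accept()
            try:
                received = 0
                while received < close_after:
                    chunk = conn.recv(min(4096, close_after - received))
                    if not chunk:        # sender closed first
                        break
                    received += len(chunk)
                # Half-close sends a FIN to the sender; data that arrives
                # after close() is typically answered with a RST.
                conn.shutdown(socket.SHUT_WR)
            finally:
                conn.close()
        finally:
            listener.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return bound_port, thread
```

      Pointing the TCP connector at the returned port instead of the iperf3 server, and then sending more than close_after bytes through the listener on port 5202, should reproduce a peer that closes mid-stream. Whether the router sees a FIN, a RST, or both depends on timing.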


          People

            Assignee: Unassigned
            Reporter: Charles E. Rolke (chug)
            Votes: 0
            Watchers: 3
