Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: proton-0.7, proton-0.8
Fix Version/s: None
Environment:
Red Hat Enterprise Linux 6.5
kernel: 2.6.32-431.1.2.el6.x86_64
qpid-proton 0.7 and commit 9939b8a990cd53c1b5e099c083bdcf61ad22232b (git-svn-id: https://svn.apache.org/repos/asf/qpid/proton/trunk@1613151 13f79535-47bb-0310-9956-ffa450edef68)
Description
If I try to connect to a closed port with a messenger, pn_messenger_recv outputs messages to stderr and then spins at high CPU usage, rather than returning with an error as expected.
This seems to depend on the kernel version. I have a RHEL 6.5 machine that reproduces the problem reliably with kernel 2.6.32-431.1.2.el6.x86_64 but not with 3.10.28-1.el6.elrepo.x86_64.
This can be easily reproduced using the "recv" example in the qpid-proton sources.
$ build/examples/messenger/c/recv amqp://127.0.0.1:1
recv: Connection refused
[0x63d8e0]:ERROR amqp:connection:framing-error SASL header mismatch: ''
CONNECTION ERROR connection aborted (remote)
# hangs at this point with high CPU usage
Compare with the behavior on a later kernel version, which seems right:
$ build/examples/messenger/c/recv amqp://127.0.0.1:1
recv: Connection refused
[0x15af8e0]:ERROR amqp:connection:framing-error SASL header mismatch: ''
CONNECTION ERROR connection aborted (remote)
send: Broken pipe
/home/rmcgover/src/qpid-proton/examples/messenger/c/recv.c:132: no valid sources
# exits with exit code 1
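To check, independently of proton, that the port really refuses connections on a given machine, a plain socket test along these lines may help. This is only a sketch: connect_errno is a hypothetical helper written for this report, not part of qpid-proton, and it uses a blocking connect rather than whatever the messenger does internally.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of qpid-proton): attempts a blocking TCP
 * connect to 127.0.0.1:port. Returns 0 on success, otherwise the errno
 * value of the call that failed. */
static int connect_errno(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    int err = 0;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0)
        err = errno;   /* save errno before close() can overwrite it */

    close(fd);
    return err;
}
```

On a machine where nothing listens on port 1, connect_errno(1) would be expected to return ECONNREFUSED, which corresponds to the "recv: Connection refused" line in both outputs above.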
Here's a sample backtrace when the hang is occurring:
(gdb) bt
#0 0x00007ffff7ffea11 in clock_gettime ()
#1 0x0000003a51e03e46 in clock_gettime () from /lib64/librt.so.1
#2 0x00007ffff7de6b5e in pn_i_now () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
#3 0x00007ffff7de4c06 in pn_selector_select () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
#4 0x00007ffff7ddf736 in pni_wait () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
#5 0x00007ffff7ddf869 in pn_messenger_tsync () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
#6 0x00007ffff7ddf8df in pn_messenger_sync () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
#7 0x00007ffff7de1676 in pn_messenger_recv () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
#8 0x00000000004014b2 in main ()
pn_messenger_tsync contains a while(true) loop that apparently never exits. strace also shows the process calling poll repeatedly.
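For illustration only, here is one general way a loop around poll can spin like this. It is not proton's code, and the report does not establish that this is the cause. The sketch assumes a descriptive that is permanently in a hung-up state (here a pipe whose write end is closed, standing in for a failed socket), so poll returns immediately on every call. The names poll_loop and make_hung_up_fd are hypothetical, and the loop is bounded by max_iters so that it terminates.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns the read end of a pipe whose write end has
 * been closed, so poll() reports POLLHUP on it right away. Returns -1 if
 * the pipe cannot be created. */
static int make_hung_up_fd(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    close(fds[1]);
    return fds[0];
}

/* Hypothetical helper: polls fd at most max_iters times and returns the
 * number of iterations that ran. With stop_on_error set, the loop exits
 * as soon as poll() reports POLLERR, POLLHUP or POLLNVAL; without it,
 * those conditions are ignored and the loop keeps polling. */
static int poll_loop(int fd, int max_iters, int timeout_ms, int stop_on_error)
{
    int iters = 0;
    while (iters < max_iters) {        /* bounded stand-in for while(true) */
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int n = poll(&pfd, 1, timeout_ms);
        iters++;
        if (n <= 0)
            break;                     /* poll failed or timed out */
        if (stop_on_error && (pfd.revents & (POLLERR | POLLHUP | POLLNVAL)))
            break;                     /* treat the error state as terminal */
        /* A real event loop would service POLLIN here. */
    }
    return iters;
}
```

With fd = make_hung_up_fd(), a call such as poll_loop(fd, 1000, 1000, 0) runs all 1000 iterations without ever waiting for the timeout, while poll_loop(fd, 1000, 1000, 1) stops after the first. A loop of the first kind without an iteration bound would show up in strace as an endless stream of poll calls.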