Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 0.6.1
Fix Version/s: None
Environment: Debian 8.3; Apache Qpid Proton 0.13.0 for drivers and dependencies. Hardware: 5 separate machines, each with 8 CPUs, 61 GB RAM, and a 30 GB HDD.
Description
We are running a network of 5 interconnected routers, each on a separate host. One of the routers in this network terminated after running successfully for more than 7 days. The router was idle (not sending or receiving any messages) when this happened. The core dump shows the process aborting with SIGABRT on a failed assertion (result == 0) in sys_mutex_lock (src/posix/threading.c, line 71), called from qdr_forward_deliver_CT on the router core thread.
Here is the full backtrace:
Reading symbols from qpid-dispatch/qdrouterd...(no debugging symbols found)...done.
[New LWP 4082]
[New LWP 4088]
[New LWP 4030]
[New LWP 4089]
[New LWP 4090]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `qdrouterd -c /x/web/LIVE/switch-dr-network/configurator/qdrouterd.conf'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007ffaaf7e6067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) up
#1 0x00007ffaaf7e7448 in __GI_abort () at abort.c:89
89 abort.c: No such file or directory.
(gdb) up
#2 0x00007ffaaf7df266 in __assert_fail_base (
fmt=0x7ffaaf918238 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0x7ffab0997baf "result == 0",
file=file@entry=0x7ffab0997b48 "/home/anhtran/qpid-package/dispatch/qpid-dispatch-0.6.x-Vanilla/src/posix/threading.c", line=line@entry=71, function=function@entry=0x7ffab0997cd1 "sys_mutex_lock")
at assert.c:92
92 assert.c: No such file or directory.
(gdb) up
#3 0x00007ffaaf7df312 in __GI___assert_fail (assertion=0x7ffab0997baf "result == 0",
file=0x7ffab0997b48 "/home/anhtran/qpid-package/dispatch/qpid-dispatch-0.6.x-Vanilla/src/posix/threading.c", line=71, function=0x7ffab0997cd1 "sys_mutex_lock") at assert.c:101
101 in assert.c
(gdb)
#4 0x00007ffab09802eb in sys_mutex_lock () from /usr/local/lib/qpid-dispatch/libqpid-dispatch.so
(gdb)
#5 0x00007ffab098863a in qdr_forward_deliver_CT ()
from /usr/local/lib/qpid-dispatch/libqpid-dispatch.so
(gdb)
#6 0x00007ffab0989274 in qdr_forward_closest_CT ()
from /usr/local/lib/qpid-dispatch/libqpid-dispatch.so
(gdb)
#7 0x00007ffab098dd98 in ?? () from /usr/local/lib/qpid-dispatch/libqpid-dispatch.so
(gdb)
#8 0x00007ffab098b45a in router_core_thread () from /usr/local/lib/qpid-dispatch/libqpid-dispatch.so
(gdb)
#9 0x00007ffab04f50a4 in start_thread (arg=0x7ffaad524700) at pthread_create.c:309
309 pthread_create.c: No such file or directory.
(gdb)
#10 0x00007ffaaf89987d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
111 ../sysdeps/unix/sysv/linux/x86_64/clone.S: No such file or directory.
(gdb)
Initial frame selected; you cannot go up.
(gdb)
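The assertion text in frames #2 and #3 (result == 0 inside sys_mutex_lock) suggests the wrapper asserts on the return value of a locking call, presumably pthread_mutex_lock; that is an assumption, since the source of threading.c is not shown in this report. The trace also does not reveal which non-zero code was returned. The C sketch below is only an illustration of that pattern: lock_or_abort, lock_and_report, and demo_relock_error are hypothetical names, not qpid-dispatch functions. It shows how an asserting wrapper turns any locking error into SIGABRT, and how a diagnostic variant could log the error code instead. An error-checking mutex is used purely to provoke a deterministic failure (EDEADLK on relock by the owner); it is not claimed to be the cause of this crash.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the asserting wrapper seen in the trace:
 * any non-zero return from pthread_mutex_lock() aborts the process
 * (unless the code is built with NDEBUG). */
static void lock_or_abort(pthread_mutex_t *m)
{
    int result = pthread_mutex_lock(m);
    assert(result == 0);
    (void) result;  /* avoid an unused-variable warning under NDEBUG */
}

/* Hypothetical diagnostic variant: log the error code and return it,
 * so the failure reason is visible without a core dump. */
static int lock_and_report(pthread_mutex_t *m, const char *where)
{
    int result = pthread_mutex_lock(m);
    if (result != 0)
        fprintf(stderr, "%s: pthread_mutex_lock failed: %d (%s)\n",
                where, result, strerror(result));
    return result;
}

/* Provoke a locking error in a controlled way: an error-checking mutex
 * rejects a second lock attempt by the thread that already owns it. */
int demo_relock_error(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t     mutex;
    int                 second;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    lock_or_abort(&mutex);                     /* first lock succeeds */
    second = lock_and_report(&mutex, "demo");  /* relock by owner fails */

    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return second;
}
```

Calling demo_relock_error() returns EDEADLK and prints the error to stderr. Had the second call gone through lock_or_abort instead, the process would abort with the same "Assertion `result == 0' failed" message seen in the trace. Other non-zero results, such as EINVAL for a mutex that was destroyed or overwritten, would trip the assertion in the same way, so logging the code as in lock_and_report is one way to narrow down a failure that cannot be reproduced on demand.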