Description
We are using Apache Qpid C++ (qpid-cpp) version 1.38.
Internally it uses qpid-proton 0.21.0.
We got a core dump during a cumulative acknowledge of multiple messages:
#0 0x00007f234f09b207 in raise () from /lib64/libc.so.6
#1 0x00007f234f09c8f8 in abort () from /lib64/libc.so.6
#2 0x00007f2352987bc6 in ?? () from /opt/gts/usr/lib64/libbaseshell.so.1.3
#3 <signal handler called>
#4 pni_add_tpwork (delivery=0x7f215802a090)
at /sw/int/prisma/contint/thirdparty/qpid-proton/000.004.002/8300001.5/x86_64-el7-linux-gnu-nvcc_gcc-v200-relwithdebinfofile/_CPackGTS_Packages/workspace/RPM/BUILD/qpid-proton-0.21.0/proton-c/src/core/engine.c:728
#5 0x00007f235d08f9da in qpid::messaging::amqp::SessionContext::acknowledge (this=this@entry=0x7f21fc000eb0, begin=..., begin@entry=..., end=...)
at /sw/int/prisma/contint/thirdparty/qpid-cpp/000.009.001/8300001.5/x86_64-el7-linux-gnu-nvcc_gcc-v200-relwithdebinfofile/_CPackGTS_Packages/workspace/RPM/BUILD/qpid-1.38.0/src/qpid/messaging/amqp/SessionContext.cpp:166
#6 0x00007f235d0909b0 in qpid::messaging::amqp::SessionContext::acknowledge (this=<optimized out>, id=..., cumulative=cumulative@entry=true)
at /sw/int/prisma/contint/thirdparty/qpid-cpp/000.009.001/8300001.5/x86_64-el7-linux-gnu-nvcc_gcc-v200-relwithdebinfofile/_CPackGTS_Packages/workspace/RPM/BUILD/qpid-1.38.0/src/qpid/messaging/amqp/SessionContext.cpp:187
#7 0x00007f235d066e48 in qpid::messaging::amqp::ConnectionContext::acknowledgeLH (this=this@entry=0x7f2178023c70, ssn=..., message=message@entry=0x7f21783e9958,
cumulative=<optimized out>)
The issue appears to be that the delivery had already been settled when acknowledge tried to settle it again.
From gdb:
(gdb) frame 4
#4 pni_add_tpwork (delivery=0x7f215802a090)
at /sw/int/prisma/contint/thirdparty/qpid-proton/000.004.002/8300001.5/x86_64-el7-linux-gnu-nvcc_gcc-v200-relwithdebinfofile/_CPackGTS_Packages/workspace/RPM/BUILD/qpid-proton-0.21.0/proton-c/src/core/engine.c:728
728 in /sw/int/prisma/contint/thirdparty/qpid-proton/000.004.002/8300001.5/x86_64-el7-linux-gnu-nvcc_gcc-v200-relwithdebinfofile/_CPackGTS_Packages/workspace/RPM/BUILD/qpid-proton-0.21.0/proton-c/src/core/engine.c
(gdb) print *delivery
$3 = {local = {condition =
, type = 36, data = 0x7f215802a210, annotations = 0x7f215802a420,
section_offset = 0, section_number = 0, failed = false, undeliverable = false, settled = true}, remote = {condition =
, type = 0, data = 0x7f215802a8e0, annotations = 0x7f215802aaf0, section_offset = 0, section_number = 0, failed = false, undeliverable = false,
settled = true}, link = 0x0, tag = 0x7f215802a180, unsettled_next = 0x7f215802d010, unsettled_prev = 0x0, work_next = 0x7f215802d010, work_prev = 0x0,
tpwork_next = 0x7f215802d010, tpwork_prev = 0x0, state = {id = 3, sending = false, sent = false, init = false}, bytes = 0x7f215802a1d0, context = 0x7f215802afb0, updated = true,
settled = true, work = false, tpwork = false, done = true, referenced = false, aborted = false}
(gdb)
The problem has occurred 5 times so far. It is rare and shows up when the application starts receiving messages; otherwise it does not occur for long periods.
I tried to find the cause of the issue, but I could not identify the real source of the problem.
I presume the issue may be fixed by one of the following solutions:
1) Add a delivery to unacked in record() only when it is not already there (I presume one delivery is stored two or more times in unacked under several IDs, but I cannot confirm this from the core dump):
qpid::framing::SequenceNumber SessionContext::record(pn_delivery_t* delivery)
2) Settle the delivery only if it has not already been settled:
void SessionContext::acknowledge(DeliveryMap::iterator begin, DeliveryMap::iterator end)
{
    error.raise();
    for (DeliveryMap::iterator i = begin; i != end; ++i) {
        types::Variant txState;
        if (transaction) {
            // transactional branch omitted in this report
        } else {
            QPID_LOG(trace, "Setting disposition for delivery " << i->first << " -> " << i->second);
            // Proposed guard: skip deliveries that are already settled.
            if (!pn_delivery_settled(i->second)) {
                pn_delivery_update(i->second, PN_ACCEPTED);
                pn_delivery_settle(i->second); //TODO: different settlement modes?
            }
        }
    }
    unacked.erase(begin, end);
}
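To make the two proposals concrete, here is a minimal standalone sketch that models them without linking against qpid-proton. Everything in it is hypothetical and only for illustration: FakeDelivery stands in for pn_delivery_t, and UnackedTracker, record(), recordUnchecked() and acknowledgeUpTo() are simplified stand-ins for the bookkeeping in SessionContext, not the actual qpid-cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Stand-in for pn_delivery_t (opaque in qpid-proton); fields are illustrative.
struct FakeDelivery {
    bool settled = false;  // models local settlement
    int settle_calls = 0;  // how many times settlement was actually performed
};

// Hypothetical, simplified model of the session's unacked bookkeeping.
class UnackedTracker {
  public:
    using DeliveryMap = std::map<std::uint32_t, FakeDelivery*>;

    // Proposal 1: store a delivery only if it is not tracked yet.
    // Returns the existing id when the same pointer is already present.
    // The linear scan is O(n); a reverse index would avoid that.
    std::uint32_t record(FakeDelivery* delivery) {
        assert(delivery != nullptr);
        for (const auto& entry : unacked_) {
            if (entry.second == delivery) return entry.first;
        }
        return recordUnchecked(delivery);
    }

    // Models the presumed current behaviour: always allocate a new id,
    // so the same delivery can end up stored under several ids.
    std::uint32_t recordUnchecked(FakeDelivery* delivery) {
        std::uint32_t id = next_++;
        unacked_[id] = delivery;
        return id;
    }

    // Proposal 2: cumulative acknowledge up to and including id, settling
    // each delivery at most once. Returns how many were settled by this call.
    std::size_t acknowledgeUpTo(std::uint32_t id) {
        DeliveryMap::iterator end = unacked_.upper_bound(id);
        std::size_t settledNow = 0;
        for (DeliveryMap::iterator i = unacked_.begin(); i != end; ++i) {
            FakeDelivery* d = i->second;
            if (!d->settled) {  // guard against a second settlement
                d->settled = true;
                ++d->settle_calls;
                ++settledNow;
            }
        }
        unacked_.erase(unacked_.begin(), end);
        return settledNow;
    }

    std::size_t size() const { return unacked_.size(); }

  private:
    DeliveryMap unacked_;
    std::uint32_t next_ = 0;
};
```

In this model either guard alone prevents the double settlement. Two caveats for the real code: the guard in proposal 2 only helps if the delivery object is still valid when the flag is read (the dump above shows link = 0x0, so that is an assumption), and pn_delivery_settled() is documented as reporting remote settlement, so the real check may need to be different.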