Description
I'm working on a trafficserver plugin - and if I spin my ~40 case test
suite through TS repeatedly I can pretty easily reproduce the assert
below. (It is not deterministic, sometimes it takes just a couple
iterations, sometimes dozens.)
Ts is 2.0.0 that I compiled myself in debug mode on 64 bit linux.
I've run it through valgrind and haven't seen my plugin doing anything
suspicious - but I cannot reproduce the assert without the plugin
installed. The plugin does change the document size and content.
Unfortuantely, I don't own the copyright on the plugin code, so I cannot
post it in whole.
Anyone have any thoughts on what I might be doing wrong, or if there is
a caching problem related to adaptations that change document sizes?
FATAL: CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`
/home/mcmanus/ts/bin/traffic_server - STACK TRACE:
/home/mcmanus/ts/bin/traffic_server(ink_fatal_va+0xcb)[0x7579bb]
/home/mcmanus/ts/bin/traffic_server(ink_fatal+0xc7)[0x757aa5]
/home/mcmanus/ts/bin/traffic_server(_ink_assert+0xdb)[0x756bff]
/home/mcmanus/ts/bin/traffic_server(_Z13dir_overwriteP7INK_MD5P4PartP3DirS4_b+0x18d)[0x6e8838]
/home/mcmanus/ts/bin/traffic_server(_ZN7CacheVC22openWriteCloseHeadDoneEiP5Event+0x189)[0x70a2d1]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN12PartCallback12aggWriteDoneEiP5Event+0x137)[0x7055dd]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x131)[0x74e66f]
/home/mcmanus/ts/bin/traffic_server(_ZN7EThread7executeEv+0x31e)[0x74eb12]
/home/mcmanus/ts/bin/traffic_server[0x74d1c6]
/lib/libpthread.so.0(+0x69ca)[0x7ffff7bc59ca]
/lib/libc.so.6(clone+0x6d)[0x7ffff59aa6fd]
Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff46a3710 (LWP 24261)]
0x00007ffff58f7a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0 0x00007ffff58f7a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007ffff58fb5c0 in *__GI_abort () at abort.c:92
#2 0x00000000007578f0 in ink_die_die_die (retval=1) at ink_error.cc:43
#3 0x00000000007579c8 in ink_fatal_va (return_code=1,
message_format=0x7ffff46a29b0 "CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`", ap=0x7ffff46a28b0)
at ink_error.cc:67
#4 0x0000000000757aa5 in ink_fatal (return_code=1,
message_format=0x7ffff46a29b0 "CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`") at ink_error.cc:75
#5 0x0000000000756bff in _ink_assert (a=0x7ec9e0 "(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)", f=0x7ec64c "CacheDir.cc", l=693)
at ink_assert.cc:47
#6 0x00000000006e8838 in dir_overwrite (key=0x1c21fe0, d=0x19d04d0, dir=0x1c22010, overwrite=0x1c370b8, must_overwrite=false) at CacheDir.cc:693
#7 0x000000000070a2d1 in CacheVC::openWriteCloseHeadDone (this=0x1c21f90, event=3900, e=0x0) at CacheWrite.cc:1170
#8 0x00000000004f1c22 in Continuation::handleEvent (this=0x1c21f90, event=3900, data=0x0) at ../iocore/eventsystem/I_Continuation.h:147
#9 0x00000000007055dd in PartCallback::aggWriteDone (this=0x19cdda0, event=1, e=0x1bc95e0) at CacheWrite.cc:327
#10 0x00000000004f1c22 in Continuation::handleEvent (this=0x19cdda0, event=1, data=0x1bc95e0) at ../iocore/eventsystem/I_Continuation.h:147
#11 0x000000000074e66f in EThread::process_event (this=0x7ffff4aa8010, e=0x1bc95e0, calling_code=1) at UnixEThread.cc:113
#12 0x000000000074eb12 in EThread::execute (this=0x7ffff4aa8010) at UnixEThread.cc:219
#13 0x000000000074d1c6 in spawn_thread_internal (a=0xc090c0) at Thread.cc:85
#14 0x00007ffff7bc59ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#15 0x00007ffff59aa6fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#16 0x0000000000000000 in ?? ()
(gdb)
Some more information
- The test in question reduces the document from 281KB to 259KB - the
headers are not altered.
- The test just runs curl serially fetching the same URL (with the same
transform) 750 times.. the assert() will fail in the first few dozen
iterations.
- I can now see a repeatable timeline:
+ transaction N goes through INK_EVENT_VCONN_WRITE_COMPLETE (having
sent all its transformed data to the user agent)
+ transaction N+1 is started for the same URL as N
+ transaction N receives INK_EVENT_HTTP_TXN_CLOSE
+ assert() fails
It appears from the logs that if N and N+1 do not overlap there is no
assert failure.