Uploaded image for project: 'Traffic Server'
  1. Traffic Server
  2. TS-413

assert in cache when write buffer contains over 256k of data

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.0.1
    • 2.1.3
    • Cache
    • None

    Description

      I'm working on a trafficserver plugin - and if I spin my ~40 case test
      suite through TS repeatedly I can pretty easily reproduce the assert
      below. (It is not deterministic, sometimes it takes just a couple
      iterations, sometimes dozens.)

      Ts is 2.0.0 that I compiled myself in debug mode on 64 bit linux.

      I've run it through valgrind and haven't seen my plugin doing anything
      suspicious - but I cannot reproduce the assert without the plugin
      installed. The plugin does change the document size and content.

      Unfortuantely, I don't own the copyright on the plugin code, so I cannot
      post it in whole.

      Anyone have any thoughts on what I might be doing wrong, or if there is
      a caching problem related to adaptations that change document sizes?

      FATAL: CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`
      /home/mcmanus/ts/bin/traffic_server - STACK TRACE:
      /home/mcmanus/ts/bin/traffic_server(ink_fatal_va+0xcb)[0x7579bb]
      /home/mcmanus/ts/bin/traffic_server(ink_fatal+0xc7)[0x757aa5]
      /home/mcmanus/ts/bin/traffic_server(_ink_assert+0xdb)[0x756bff]
      /home/mcmanus/ts/bin/traffic_server(_Z13dir_overwriteP7INK_MD5P4PartP3DirS4_b+0x18d)[0x6e8838]
      /home/mcmanus/ts/bin/traffic_server(_ZN7CacheVC22openWriteCloseHeadDoneEiP5Event+0x189)[0x70a2d1]
      /home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
      /home/mcmanus/ts/bin/traffic_server(_ZN12PartCallback12aggWriteDoneEiP5Event+0x137)[0x7055dd]
      /home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
      /home/mcmanus/ts/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x131)[0x74e66f]
      /home/mcmanus/ts/bin/traffic_server(_ZN7EThread7executeEv+0x31e)[0x74eb12]
      /home/mcmanus/ts/bin/traffic_server[0x74d1c6]
      /lib/libpthread.so.0(+0x69ca)[0x7ffff7bc59ca]
      /lib/libc.so.6(clone+0x6d)[0x7ffff59aa6fd]

      Program received signal SIGABRT, Aborted.
      [Switching to Thread 0x7ffff46a3710 (LWP 24261)]
      0x00007ffff58f7a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
      64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
      in ../nptl/sysdeps/unix/sysv/linux/raise.c
      (gdb) bt
      #0 0x00007ffff58f7a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
      #1 0x00007ffff58fb5c0 in *__GI_abort () at abort.c:92
      #2 0x00000000007578f0 in ink_die_die_die (retval=1) at ink_error.cc:43
      #3 0x00000000007579c8 in ink_fatal_va (return_code=1,
      message_format=0x7ffff46a29b0 "CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`", ap=0x7ffff46a28b0)
      at ink_error.cc:67
      #4 0x0000000000757aa5 in ink_fatal (return_code=1,
      message_format=0x7ffff46a29b0 "CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`") at ink_error.cc:75
      #5 0x0000000000756bff in _ink_assert (a=0x7ec9e0 "(unsigned int) dir_approx_size(dir) <= (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)", f=0x7ec64c "CacheDir.cc", l=693)
      at ink_assert.cc:47
      #6 0x00000000006e8838 in dir_overwrite (key=0x1c21fe0, d=0x19d04d0, dir=0x1c22010, overwrite=0x1c370b8, must_overwrite=false) at CacheDir.cc:693
      #7 0x000000000070a2d1 in CacheVC::openWriteCloseHeadDone (this=0x1c21f90, event=3900, e=0x0) at CacheWrite.cc:1170
      #8 0x00000000004f1c22 in Continuation::handleEvent (this=0x1c21f90, event=3900, data=0x0) at ../iocore/eventsystem/I_Continuation.h:147
      #9 0x00000000007055dd in PartCallback::aggWriteDone (this=0x19cdda0, event=1, e=0x1bc95e0) at CacheWrite.cc:327
      #10 0x00000000004f1c22 in Continuation::handleEvent (this=0x19cdda0, event=1, data=0x1bc95e0) at ../iocore/eventsystem/I_Continuation.h:147
      #11 0x000000000074e66f in EThread::process_event (this=0x7ffff4aa8010, e=0x1bc95e0, calling_code=1) at UnixEThread.cc:113
      #12 0x000000000074eb12 in EThread::execute (this=0x7ffff4aa8010) at UnixEThread.cc:219
      #13 0x000000000074d1c6 in spawn_thread_internal (a=0xc090c0) at Thread.cc:85
      #14 0x00007ffff7bc59ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
      #15 0x00007ffff59aa6fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
      #16 0x0000000000000000 in ?? ()
      (gdb)

      Some more information

      • The test in question reduces the document from 281KB to 259KB - the
        headers are not altered.
      • The test just runs curl serially fetching the same URL (with the same
        transform) 750 times.. the assert() will fail in the first few dozen
        iterations.
      • I can now see a repeatable timeline:

      + transaction N goes through INK_EVENT_VCONN_WRITE_COMPLETE (having
      sent all its transformed data to the user agent)
      + transaction N+1 is started for the same URL as N
      + transaction N receives INK_EVENT_HTTP_TXN_CLOSE
      + assert() fails

      It appears from the logs that if N and N+1 do not overlap there is no
      assert failure.

      Attachments

        1. TS-413-jp-1.patch
          1 kB
          John Plevyak
        2. TS-413-jp-2.patch
          4 kB
          John Plevyak
        3. TS-413-mcmanus-1.patch
          2 kB
          patrick mcmanus

        Activity

          People

            jplevyak John Plevyak
            jplevyak John Plevyak
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: