  1. Traffic Server
  2. TS-2848

ATS crash in HttpSM::release_server_session

Details

    • Bug
    • Status: Closed
    • Blocker
    • Resolution: Fixed
    • None
    • 6.0.0
    • HTTP

    Description

      We run ATS on production hosts and have seen crashes with the following stack trace. The crash is infrequent: roughly once a week, sometimes less often. It has recurred over the last 2 months, but we have not found the root cause and cannot reproduce it on demand; we can only wait for it to happen again.

      NOTE: Traffic Server received Sig 11: Segmentation fault
      /home/y/bin/traffic_server - STACK TRACE:
      /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500]
      /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb]
      /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
      /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d]
      /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
      /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
      /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e]
      /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098]
      /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2]
      /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93]
      /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f]
      /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373]
      /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d]
      /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944]
      /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702]
      /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
      /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
      /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
      /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
      /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d]
      /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
      /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
      /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
      /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c368888d]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
      /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
      /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
      /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d]
      /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
      /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828]
      /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098]
      /home/y/bin/traffic_server[0x68606b]
      /home/y/bin/traffic_server[0x688a14]
      /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582]
      /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf]
      /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3]
      /home/y/bin/traffic_server[0x6a785a]
      /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851]
      /lib64/libc.so.6(clone+0x6d)[0x321e2e890d]
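
The frames above carry Itanium-ABI mangled names. A minimal sketch of a helper that turns them into readable signatures, assuming a GCC or Clang toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>`; the function name `demangle` is made up for this illustration and is not part of Traffic Server:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged when the demangler reports an error.
std::string
demangle(const std::string &mangled)
{
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with
  // malloc(); the caller must free() it.
  char *out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    return mangled;
  }
  std::string result(out);
  std::free(out);
  return result;
}
```

For example, `demangle("_ZN6HttpSM22release_server_sessionEb")` yields `HttpSM::release_server_session(bool)`, which matches frame #0 of the gdb output.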
      

      gdb backtrace:

      (gdb) bt
      #0  0x0000000000529eb5 in HttpSM::release_server_session (this=0x2b12bc107bd0,
      serve_from_cache=true) at HttpSM.cc:4892
      #1  0x00000000005362bb in HttpSM::set_next_state (this=0x2b12bc107bd0) at
      HttpSM.cc:7010
      #2  0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at
      HttpSM.cc:1557
      #3  0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0,
      event=0, data=0x0) at HttpSM.cc:1489
      #4  0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at
      HttpSM.cc:6815
      #5  0x000000000051e422 in HttpSM::do_hostdb_lookup (this=0x2b12bc107bd0) at
      HttpSM.cc:3919
      #6  0x0000000000536b8d in HttpSM::set_next_state (this=0x2b12bc107bd0) at
      HttpSM.cc:6914
      #7  0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at
      HttpSM.cc:1557
      #8  0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0,
      event=0, data=0x0) at HttpSM.cc:1489
      #9  0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at
      HttpSM.cc:6815
      #10 0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at
      HttpSM.cc:1557
      #11 0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0,
      event=0, data=0x0) at HttpSM.cc:1489
      #12 0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at
      HttpSM.cc:6815
      #13 0x000000000052ff8e in HttpSM::state_cache_open_read (this=0x2b12bc107bd0,
      event=1102, data=0x2b13240e2190) at HttpSM.cc:2457
      #14 0x0000000000533098 in HttpSM::main_handler (this=0x2b12bc107bd0,
      event=1102, data=0x2b13240e2190) at HttpSM.cc:2516
      #15 0x000000000050bef2 in handleEvent (this=0x2b12bc109d88, event=<value
      optimized out>, data=0x2b13240e2190) at
      ../../iocore/eventsystem/I_Continuation.h:146
      #16 HttpCacheSM::state_cache_open_read (this=0x2b12bc109d88, event=<value
      optimized out>, data=0x2b13240e2190) at HttpCacheSM.cc:118
      #17 0x00000000005f0a93 in handleEvent (this=0x2b13240e2190, event=<value
      optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
      #18 CacheVC::callcont (this=0x2b13240e2190, event=<value optimized out>) at
      ../../iocore/cache/P_CacheInternal.h:666
      #19 0x000000000065934f in CacheVC::openReadStartHead (this=0x2b13240e2190,
      event=3900, e=0x0) at CacheRead.cc:1193
      #20 0x0000000000634b7d in handleEvent (this=0x2b13240e2190, event=<value
      optimized out>, e=<value optimized out>)
          at ../../iocore/eventsystem/I_Continuation.h:146
      #21 CacheVC::handleReadDone (this=0x2b13240e2190, event=<value optimized out>,
      e=<value optimized out>) at Cache.cc:2257
      #22 0x00000000005f0bf5 in handleEvent (this=<value optimized out>, event=<value
      optimized out>, data=<value optimized out>)
          at ../../iocore/eventsystem/I_Continuation.h:146
      #23 AIOCallbackInternal::io_complete (this=<value optimized out>, event=<value
      optimized out>, data=<value optimized out>) at ../../iocore/aio/P_AIO.h:123
      #24 0x00000000006a89bf in handleEvent (this=0x2b11e0404010, e=0x2b12cc08afe0,
      calling_code=1) at I_Continuation.h:146
      #25 EThread::process_event (this=0x2b11e0404010, e=0x2b12cc08afe0,
      calling_code=1) at UnixEThread.cc:141
      #26 0x00000000006a953b in EThread::execute (this=0x2b11e0404010) at
      UnixEThread.cc:192
      #27 0x00000000006a785a in spawn_thread_internal (a=0x10a6e20) at Thread.cc:88
      #28 0x00002b11daee9851 in start_thread () from /lib64/libpthread.so.0
      #29 0x00000038174e890d in clone () from /lib64/libc.so.6
      

      The code where the crash happens is shown below. The crash comes from dereferencing t_state.current.server, which is NULL under some conditions. There is no NULL check here, so I think one of the following must hold:

      1. t_state.current.server should never be NULL here, and we can add an assert.
      2. Or t_state.current.server can legitimately be NULL, and we should add a check here, possibly with some additional handling.

      I'm not familiar with the HTTP state machine code. Could someone point out which interpretation is correct? Any comments on this or on potential root causes would be appreciated. Thank you!
        proxy/http/HttpSM.cc
        // void HttpSM::release_server_session()
        //
        //  Called when we are not tunneling a response from the
        //   server.  If the session is keep alive, release it back to the
        //   shared pool, otherwise close it
        //
        void
        HttpSM::release_server_session(bool serve_from_cache)
        {
          if (server_session != NULL) {
            if (t_state.current.server->keep_alive == HTTP_KEEPALIVE &&
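
To make the two options from the description concrete, here is a minimal sketch using stand-in types. Everything in it (`MockSM`, `ServerSession`, `ConnectionAttributes`, the `released`/`closed` flags, and the choice to close the session when the pointer is NULL) is an assumption made for illustration. It is not the real `HttpSM` code, it is not the content of the attached diff, and the remaining conditions of the truncated `if` above are not modelled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real Traffic Server types; only the
// fields needed to contrast the two options are modelled.
enum KeepAlive { HTTP_NO_KEEPALIVE, HTTP_KEEPALIVE };

struct ConnectionAttributes {
  KeepAlive keep_alive = HTTP_NO_KEEPALIVE;
};

struct ServerSession {
  bool released = false; // handed back to the shared pool
  bool closed   = false; // connection closed
};

struct MockSM {
  ServerSession *server_session        = nullptr;
  ConnectionAttributes *current_server = nullptr; // models t_state.current.server

  // Option 1: a NULL current_server is a broken invariant, so fail loudly.
  void
  release_with_assert()
  {
    if (server_session != nullptr) {
      assert(current_server != nullptr);
      finish(current_server->keep_alive == HTTP_KEEPALIVE);
    }
  }

  // Option 2: a NULL current_server is tolerated. Assumption: without
  // keep-alive information the session is closed rather than reused.
  void
  release_with_check()
  {
    if (server_session != nullptr) {
      finish(current_server != nullptr && current_server->keep_alive == HTTP_KEEPALIVE);
    }
  }

private:
  // Release to the pool or close, then drop our reference to the session.
  void
  finish(bool reuse)
  {
    if (reuse) {
      server_session->released = true;
    } else {
      server_session->closed = true;
    }
    server_session = nullptr;
  }
};
```

With option 1 a debug build stops at the exact point where the invariant breaks, which helps find the root cause; with option 2 production hosts keep running, at the cost of possibly hiding the underlying state machine bug.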
        

      Attachments

        1. TS-2848.diff
          0.6 kB
          Feifei Cai

          People

            amc Alan M. Carroll
            ffcai Feifei Cai