  1. Traffic Server
  2. TS-4816

ATS 6.2.0 - crashing with broken pipe, sig11 segmentation fault

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 7.1.0
    • Component/s: Manager
    • Labels:

      Description

      Hi,

      We just upgraded to ATS 6.2.0 via FreeBSD ports:

      [root@<machine> ~]# uname -a
      FreeBSD <machine> 10.3-RELEASE-p7 FreeBSD 10.3-RELEASE-p7 #0: Thu Aug 11 18:38:15 UTC 2016     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
      [root@<machine> ~]# pkg info | grep traff
      trafficserver-6.2.0            Fast, scalable and extensible HTTP proxy server
      

      We are experiencing crashes, usually during the day and hardly ever under "low" load, but the most disruptive crashes occur in the early morning. Along with this, we also see a memory leak.
      We are using ATS as an enterprise proxy to the Internet, and since we have a very good Internet connection we have also disabled caching.

      I'm not sure how to attach files, so here goes.
      manager.log:

      [Sep  2 11:48:28.017] Manager {0x804006400} ERROR: [LocalManager::sendMgmtMsgToProcesses] Error writing message
      [Sep  2 11:48:28.017] Manager {0x804006400} ERROR: <MgmtUtils.cc:289 (mgmt_elog)>  (last system error 32: Broken pipe)
      [Sep  2 11:48:38.305] {0x804006400} STATUS: opened /var/log/trafficserver/manager.log
      [Sep  2 11:48:38.305] {0x804006400} NOTE: <DiagsConfig.cc:141 (reconfigure_diags)> updated diags config
      [Sep  2 11:48:38.311] Manager {0x804006400} NOTE: [ClusterCom::ClusterCom] Node running on OS: 'FreeBSD' Release: '10.3-RELEASE-p7'
      [Sep  2 11:48:38.312] Manager {0x804006400} NOTE: [LocalManager::listenForProxy] Listening on port: 8080 (IPv4)
      [Sep  2 11:48:38.313] Manager {0x804006400} NOTE: [LocalManager::listenForProxy] Listening on port: 8080 (IPv6)
      [Sep  2 11:48:38.313] Manager {0x804006400} NOTE: [TrafficManager] Setup complete
      [Sep  2 11:48:39.321] Manager {0x804006400} NOTE: [LocalManager::startProxy] Launching ts process
      [Sep  2 11:48:39.336] Manager {0x804006400} NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '17'
      [Sep  2 11:48:39.336] Manager {0x804006400} NOTE: [Alarms::signalAlarm] Server Process born
      [Sep  2 11:51:32.574] Manager {0x804006400} ERROR: [LocalManager::sendMgmtMsgToProcesses] Error writing message
      [Sep  2 11:51:32.574] Manager {0x804006400} ERROR: <MgmtUtils.cc:289 (mgmt_elog)>  (last system error 32: Broken pipe)
      [Sep  2 11:51:32.669] Manager {0x804006400} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: Segmentation fault
      [Sep  2 11:51:32.669] Manager {0x804006400} ERROR: [Alarms::signalAlarm] Server Process was reset
      [Sep  2 11:51:33.674] Manager {0x804006400} NOTE: [LocalManager::startProxy] Launching ts process
      [Sep  2 11:51:33.689] Manager {0x804006400} NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '13'
      [Sep  2 11:51:33.690] Manager {0x804006400} NOTE: [Alarms::signalAlarm] Server Process born
      [Sep  3 04:14:35.380] Manager {0x804006400} ERROR: [LocalManager::sendMgmtMsgToProcesses] Error writing message
      [Sep  3 04:14:35.380] Manager {0x804006400} ERROR: <MgmtUtils.cc:289 (mgmt_elog)>  (last system error 32: Broken pipe)
      [Sep  3 04:14:35.748] Manager {0x804006400} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: Segmentation fault
      [Sep  3 04:14:35.748] Manager {0x804006400} ERROR: [Alarms::signalAlarm] Server Process was reset
      [Sep  3 04:14:36.814] Manager {0x804006400} NOTE: [LocalManager::startProxy] Launching ts process
      [Sep  3 04:14:36.828] Manager {0x804006400} NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '13'
      [Sep  3 04:14:36.829] Manager {0x804006400} NOTE: [Alarms::signalAlarm] Server Process born
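
To line up the crash times with load, the "Server Process terminated" entries can be pulled out of manager.log with a short script. This is only a minimal sketch, assuming the line format shown above; the name `crash_times` is hypothetical, and the `year` default is an assumption because the log lines themselves carry no year.

```python
import re
from datetime import datetime

# Matches manager.log lines of the form shown above, e.g.
# [Sep  2 11:51:32.669] Manager {...} ERROR: [...] Server Process terminated due to Sig 11: Segmentation fault
CRASH_RE = re.compile(
    r"^\s*\[(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2})\.\d{3}\].*"
    r"Server Process terminated due to Sig (\d+)"
)


def crash_times(lines, year=2016):
    """Return a list of (datetime, signal_number) for each reported server crash.

    `year` is an assumption supplied by the caller; the log has no year field.
    """
    crashes = []
    for line in lines:
        match = CRASH_RE.match(line)
        if match is None:
            continue  # not a crash line (e.g. the "Broken pipe" or "born" entries)
        month, day, clock, sig = match.groups()
        stamp = datetime.strptime(
            f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
        )
        crashes.append((stamp, int(sig)))
    return crashes
```

Passing an open file handle for manager.log as `lines` gives one entry per crash, which can then be grouped by hour to check whether the crashes really cluster in the early morning.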
      

      traffic.out - since this isn't timestamped, I'm not sure whether I'm leaving some of the stack trace out:

      traffic_server[TrafficManager] ==> Cleaning up and reissuing signal #15
      : Terminated
      traffic_server: Terminatedtraffic_servertraffic_servertraffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      getpeereid -> 0 (54, Connection reset by peer)[TrafficManager] ==> Cleaning up and reissuing signal #15
      traffic_server: Terminated
      traffic_server: Terminated
      traffic_server: using root directory '/usr/local'
      [TrafficManager] ==> signal #15
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      traffic_server: Terminated
      traffic_server: Terminated
      traffic_server: Terminated
      traffic_server: Terminated
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      traffic_server: Segmentation fault
      traffic_server - STACK TRACE:
      0x4af409 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
      0x802735b37 <pthread_sigmask+0x507> at /lib/libthr.so.3
      0x80273522c <pthread_getspecific+0xe1c> at /lib/libthr.so.3
      

      /var/log/messages

      Sep  2 06:58:03 <machine> traffic_manager[5604]: {0x804006400} ERROR: [LocalManager::sendMgmtMsgToProcesses] Error writing message
      Sep  2 06:58:03 <machine> kernel: pid 6680 (traffic_server), uid 80: exited on signal 11
      Sep  2 06:58:03 <machine> traffic_manager[5604]: {0x804006400} ERROR: <MgmtUtils.cc:289 (mgmt_elog)>  (last system error 32: Broken pipe)
      Sep  2 06:58:04 <machine> traffic_cop[5603]: cannot find traffic_server [1]
      Sep  2 06:58:04 <machine> traffic_manager[5604]: {0x804006400} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: Segmentation fault
      Sep  2 06:58:04 <machine> traffic_manager[5604]: {0x804006400} ERROR: [Alarms::signalAlarm] Server Process was reset
      Sep  2 06:58:08 <machine> traffic_server[9951]: NOTE: --- traffic_server Starting ---
      Sep  2 06:58:08 <machine> traffic_server[9951]: NOTE: traffic_server Version: Apache Traffic Server - traffic_server - 6.2.0 - (build # 083112 on Aug 31 2016 at 12:51:58)
      Sep  2 06:58:08 <machine> traffic_server[9951]: NOTE: RLIMIT_NOFILE(8):cur(190297),max(190297)
      

      Every other second /var/log/messages is also getting 1-10 lines of this:

      Sep  3 17:48:50 <machine> traffic_server[14338]: {0x804008000} ERROR: <HttpSM.cc:1159 (state_raw_http_server_open)> [HttpSM::state_raw_http_server_open] event: EVENT_INTERVAL state: 0 server_entry: 0x0
      

      And "ps axu" output showing memory usage:

      [root@<machine> /usr/local/etc/trafficserver]# ps axu | grep "USER\|traff"
      USER      PID  %CPU %MEM     VSZ     RSS TT  STAT STARTED        TIME COMMAND
      www     14338   7.9 21.3 1910932 1778200  -  S     4:14AM    27:51.59 /usr/local/bin/traffic_server -M --bind_stdout /var/log/trafficserver/traffic.out --bind_stderr /var/log/traff
      root     5602   0.0  0.0   14492    2004  -  Is   Thu10AM     0:00.00 daemon: /usr/local/bin/traffic_cop[5603] (daemon)
      root     5603   0.0  0.1   64360    7516  -  Ss   Thu10AM     0:07.96 /usr/local/bin/traffic_cop
      www     10897   0.0  0.2   87544   13492  -  S    Fri11AM     0:34.78 /usr/local/bin/traffic_manager --bind_stdout /var/log/trafficserver/traffic.out --bind_stderr /var/log/traffic
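
To follow the suspected leak over time, the RSS column can be sampled periodically and converted to MiB. The sketch below parses one data row of the `ps axu` output above. The name `parse_ps_line` is hypothetical, and the code assumes the 11-column layout shown here, with VSZ and RSS reported in 1024-byte units as on FreeBSD.

```python
def parse_ps_line(line):
    """Split one `ps axu` data row into a dict.

    Assumes the column order USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
    and that VSZ/RSS are in KiB; both sizes are converted to MiB.
    """
    # COMMAND is the 11th column and may itself contain spaces,
    # so split at most 10 times and keep the remainder intact.
    fields = line.split(None, 10)
    if len(fields) < 11:
        raise ValueError("unexpected ps row: %r" % line)
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "cpu_pct": float(fields[2]),
        "mem_pct": float(fields[3]),
        "vsz_mib": int(fields[4]) / 1024,
        "rss_mib": int(fields[5]) / 1024,
        "command": fields[10],
    }
```

For the traffic_server row above this gives an RSS of roughly 1.7 GiB about 13.5 hours after the 4:14AM restart; logging the same value every few minutes would show whether the growth is steady or tied to load.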
      

        Attachments

        1. crash-2016-09-05-075309.log
          63 kB
          David Brodin
        2. ats_gdb-160905.txt
          6 kB
          David Brodin
        3. est_socks-ats_6.2.0.png
          111 kB
          David Brodin
        4. ats_manager-gdb-160906.txt
          4 kB
          David Brodin
        5. gdb-traffic_server-160909.txt
          3 kB
          David Brodin

            People

            • Assignee:
              Unassigned
              Reporter:
              durd David Brodin