Traffic Server - TS-156

Enabling serve_stale_for with SRV records enabled crashes traffic_server

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: 2.1.2
    • Component/s: None
    • Labels:
      None

      Description

      (This is moved from Y! Bugzilla ticket 2561190, originally posted by Leif Hedstrom on 2009-02-25):

      We've been testing 1.17.13 on a production box, and activated the new features that we've been adding. This causes TS
      to crash fairly frequently, and I've tracked this down to this configuration:

      CONFIG proxy.config.hostdb.serve_stale_for INT 60

      Setting this back to 0 (which disables the new feature) eliminates the crashes.

      Comment #5 (Leif Hedstrom):

      So, it seems this only triggers if the SRV records feature is enabled as well, i.e.:

      CONFIG proxy.config.srv_enabled INT 1

      I've examined the code for the serve_stale_for feature, and it looks sound. However, the SRV records feature is all
      brand-new code, and largely untested.
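
      For reference, the combination of settings that reportedly triggers the crash (both lines taken from the ticket;
      everything else in records.config left at its defaults) would look like:

      CONFIG proxy.config.hostdb.serve_stale_for INT 60
      CONFIG proxy.config.srv_enabled INT 1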

      Comment #4 (Leif Hedstrom):

      This is a more likely stack trace (thanks, Bryan, for reminding me about our fantastic stack-trace feature that logs
      the traces into traffic.log):

      0x083BCD02:bin/traffic_server:ink_atomiclist_push
      0x082D158E:bin/traffic_server:probe(ProxyMutex*, INK_MD5&, char*, int, int, int, void*, bool, bool)
      0x082D1597:bin/traffic_server:probe(ProxyMutex*, INK_MD5&, char*, int, int, int, void*, bool, bool)
      0x081C9DD3:bin/traffic_server:HttpTransact::delete_srv_entry(HttpTransact::State*, int)
      0x081CB363:bin/traffic_server:HttpTransact::handle_response_from_server(HttpTransact::State*)
      0x081CB9CE:bin/traffic_server:HttpTransact::HandleResponse(HttpTransact::State*)
      0x081DFF7B:bin/traffic_server:HttpSM::call_transact_and_set_next_state(void (HttpTransact::State*))
      0x081E3D57:bin/traffic_server:HttpSM::handle_server_setup_error(int, void*)
      0x081E721A:bin/traffic_server:HttpSM::state_read_server_response_header(int, void*)
      0x081E69A2:bin/traffic_server:HttpSM::main_handler(int, void*)
      0x0832B69F:bin/traffic_server:UnixNetVConnection::net_read_io(NetHandler*, EThread*)
      0x08330AB5:bin/traffic_server:NetHandler::mainNetEvent(int, Event*)
      0x0834B9C2:bin/traffic_server:EThread::process_event(Event*, int)
      0x0834CAFE:bin/traffic_server:EThread::execute()
      0x0812CB83:bin/traffic_server:main
      0x00418DE3:/lib/tls/libc.so.6:__libc_start_main
      0x080D6CA1:bin/traffic_server:__gxx_personality_v0
      0x080D6CA1:bin/traffic_server:__gxx_personality_v0
      0x080D6CA1:bin/traffic_server:__gxx_personality_v0
      0x080D6CA1:bin/traffic_server:__gxx_personality_v0
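
      The top frame, ink_atomiclist_push, is a lock-free list push. As a rough illustration of the kind of operation
      involved (a simplified sketch, NOT the actual Traffic Server implementation), the push is a compare-and-swap loop;
      if the node being pushed points into freed or stale memory (e.g. a hostdb entry that expired while still
      referenced), the write to its next pointer is exactly where a crash like the one above would land:

      ```c
      #include <stdatomic.h>
      #include <stdio.h>

      /* Simplified sketch of a lock-free list push, in the spirit of
       * ink_atomiclist_push; not the real ATS code. */
      typedef struct node {
          struct node *next;
          int value;
      } node_t;

      typedef struct {
          _Atomic(node_t *) head;
      } atomic_list_t;

      static void atomic_list_push(atomic_list_t *l, node_t *n)
      {
          node_t *old = atomic_load(&l->head);
          do {
              /* If n is a dangling pointer to a freed entry, this store
               * is where the process would fault. */
              n->next = old;
          } while (!atomic_compare_exchange_weak(&l->head, &old, n));
      }

      int main(void)
      {
          atomic_list_t list = { NULL };
          node_t a = { NULL, 1 }, b = { NULL, 2 };
          atomic_list_push(&list, &a);
          atomic_list_push(&list, &b);
          /* LIFO order: the last node pushed is printed first. */
          for (node_t *p = atomic_load(&list.head); p; p = p->next)
              printf("%d\n", p->value);
          return 0;
      }
      ```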

        Activity

        Eric Balsa added a comment -

        Sent 10M requests through the dev branch on OS X, and about 5M through an Ubuntu 9.10 dev build, with the config options specified above in reverse proxy mode, and could not reproduce. Either:

        1. it's masked now,
        2. it's fixed with John's changes, or
        3. it's still present and more difficult to reproduce.

        --Eric

        ab results:
        ===
        Concurrency Level: 200
        Time taken for tests: 495.155 seconds
        Complete requests: 10000000
        Failed requests: 0
        Write errors: 0
        Keep-Alive requests: 10000000
        Total transferred: 4390039071 bytes
        HTML transferred: 1510013439 bytes
        Requests per second: 20195.68 /sec (mean)
        Time per request: 9.903 [ms] (mean)
        Time per request: 0.050 [ms] (mean, across all concurrent requests)
        Transfer rate: 8658.19 [Kbytes/sec] received

        Connection Times (ms)
                      min  mean[+/-sd]  median  max
        Connect:        0     0    0.0       0    7
        Processing:     0    10    5.0       9  140
        Waiting:        0    10    5.0       9  140
        Total:          0    10    5.0       9  140

        Percentage of the requests served within a certain time (ms)
        50% 9
        66% 10
        75% 11
        80% 12
        90% 15
        95% 18
        98% 24
        99% 29
        100% 140 (longest request)
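
        For context, output like the above would come from an ab invocation along these lines (hostname and port are
        hypothetical; -n sets the request count, -c the concurrency level, -k enables keep-alive):

        ab -k -n 10000000 -c 200 http://proxy.example.com:8080/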

        Leif Hedstrom added a comment -

        I can't reproduce this now either; please reopen if needed.


          People

          • Assignee: Unassigned
          • Reporter: Leif Hedstrom
          • Votes: 0
          • Watchers: 0