Traffic Server / TS-441

Errors and crashes reported on the cache

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.2
    • Fix Version/s: 2.1.3
    • Component/s: Cache
    • Labels:
      None

      Description

      This is from Pranav Desai:

      I am running a load test with some video files to see . I am using
      curl-loader to generate the load. I have modified it to add a random
      number to the URLs before sending so I can test with a single URL and
      still stress the cache. The webserver is a lighttpd server with
      rewrite rules to translate the random strings back to a common URL.
      The URL is essentially a 15MB video file. I can provide more details
      on the setup if needed.
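
The URL randomization described above could look roughly like the sketch below. It is only an illustration: the report does not say where curl-loader inserts the random number or what the lighttpd rewrite rules look like, so the `/r<number>` path prefix, the example host and file name, and both function names are assumptions.

```python
import random
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical origin URL; the real test URL is not given in the report.
BASE_URL = "http://origin.example/videos/sample.flv"


def add_random_token(url, rng):
    """Prefix the path with a random number so each request gets a distinct cache key."""
    parts = urlsplit(url)
    token = rng.randrange(10**9)
    return urlunsplit(parts._replace(path=f"/r{token}{parts.path}"))


def strip_random_token(url):
    """Mimic an origin-side rewrite: drop the random prefix to recover the common URL."""
    parts = urlsplit(url)
    path = re.sub(r"^/r\d+(?=/)", "", parts.path)
    return urlunsplit(parts._replace(path=path))


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        url = add_random_token(BASE_URL, rng)
        print(url, "->", strip_random_token(url))
```

With this scheme the proxy sees a different URL on nearly every request, so each one is a cache miss followed by a write of the full object, while the origin always serves the same file.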

      Here are the errors I see in /var/log/messages
      Sep 16 12:05:31 c1b14 traffic_server[30008]: {1095895360} NOTE:
      OpenReadHead failed for cachekey F99D907A : vector inconsistency with
      2408
      Sep 16 12:05:33 c1b14 traffic_server[30008]: {1077999936} NOTE:
      OpenReadHead failed for cachekey 8CEE75D9 : vector inconsistency with
      2416
      Sep 16 12:05:33 c1b14 traffic_server[30008]: {1079052608} NOTE:
      OpenReadHead failed for cachekey 2DC0CAB : vector inconsistency with
      2416
      Sep 16 12:05:34 c1b14 traffic_server[30008]: {1083263296} NOTE:
      OpenReadHead failed for cachekey 21712A98 : vector inconsistency with
      2416
      Sep 16 12:05:36 c1b14 traffic_server[30008]: {1105369408} NOTE:
      OpenReadHead failed for cachekey FFD8902 : vector inconsistency with
      2416
      Sep 16 12:05:36 c1b14 traffic_server[30008]: {1074841920} NOTE:
      OpenReadHead failed for cachekey C28AE2D3 : vector inconsistency with
      2416
      Sep 16 12:05:37 c1b14 traffic_server[30008]: {1083263296} NOTE:
      OpenReadHead failed for cachekey 38E6F4CE : vector inconsistency with
      2416
      ...
      lots of them.

      Sep 16 12:05:53 c1b14 traffic_server[30008]: {1102211392} NOTE:
      OpenReadHead failed for cachekey 160CEE08 : vector inconsistency with
      2416
      Sep 16 12:05:55 c1b14 traffic_manager[29998]: {139744257029936} FATAL:
      [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
      Sep 16 12:05:55 c1b14 traffic_manager[29998]: {139744257029936} FATAL:
      (last system error 104: Connection reset by peer)
      Sep 16 12:05:55 c1b14 traffic_manager[29998]: {139744257029936} NOTE:
      [LocalManager::mgmtShutdown] Executing shutdown request.
      Sep 16 12:05:55 c1b14 traffic_manager[29998]: {139744257029936} NOTE:
      [LocalManager::processShutdown] Executing process shutdown request.
      Sep 16 12:05:55 c1b14 traffic_manager[29998]: {139744257029936} ERROR:
      [LocalManager::sendMgmtMsgToProcesses] Error writing message
      Sep 16 12:05:55 c1b14 traffic_manager[29998]: {139744257029936} ERROR:
      (last system error 32: Broken pipe)
      Sep 16 12:05:55 c1b14 traffic_cop[29996]: cop received child status
      signal [29998 2816]
      Sep 16 12:05:55 c1b14 traffic_cop[29996]: traffic_manager not running,
      making sure traffic_server is dead
      Sep 16 12:05:55 c1b14 traffic_cop[29996]: spawning traffic_manager
      Sep 16 12:05:55 c1b14 traffic_manager[30116]: NOTE: --- Manager Starting ---
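
To get a quick picture of how many of these failures occur before the restart, a log excerpt like the one above can be summarized with a small script. This is a sketch, not part of the original report: the regular expression is derived only from the message text quoted here, the function name `summarize` is made up, and the number after "vector inconsistency with" is tallied as-is without assuming what it means.

```python
import re
from collections import Counter

# Matches the message even when syslog or email wrapping splits it across lines.
PATTERN = re.compile(
    r"OpenReadHead failed for cachekey\s+([0-9A-Fa-f]+)\s*:\s*"
    r"vector inconsistency with\s+(\d+)"
)


def summarize(log_text):
    """Return (total failures, distinct cache keys, Counter of trailing values)."""
    matches = PATTERN.findall(log_text)
    keys = {key for key, _ in matches}
    by_value = Counter(value for _, value in matches)
    return len(matches), len(keys), by_value


# First three entries from the log excerpt in this report.
SAMPLE = """\
Sep 16 12:05:31 c1b14 traffic_server[30008]: {1095895360} NOTE:
OpenReadHead failed for cachekey F99D907A : vector inconsistency with
2408
Sep 16 12:05:33 c1b14 traffic_server[30008]: {1077999936} NOTE:
OpenReadHead failed for cachekey 8CEE75D9 : vector inconsistency with
2416
Sep 16 12:05:33 c1b14 traffic_server[30008]: {1079052608} NOTE:
OpenReadHead failed for cachekey 2DC0CAB : vector inconsistency with
2416
"""

if __name__ == "__main__":
    total, distinct, by_value = summarize(SAMPLE)
    print(f"{total} failures across {distinct} cache keys")
    for value, count in by_value.most_common():
        print(f"  trailing value {value}: {count}")
```

Running it against the full /var/log/messages file instead of `SAMPLE` would show whether the failures cluster on a few keys or are spread across the randomized URLs.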

      I don't know if it's generating a core; at least I couldn't find one. It
      does have a bunch of messages like the following in traffic.out. I can
      send the traffic.out file if it helps. I do have ulimit set correctly on
      the shell that I run 'trafficserver start' from, but I'm not sure if
      it's needed within the script as well?
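
On the ulimit question: on Unix-like systems a child process inherits its parent's resource limits unless something in the startup chain changes them. Whether the trafficserver script, traffic_cop, or traffic_manager adjusts the core limit is not established in this report, so the sketch below only demonstrates the inheritance itself. The helper names are made up for illustration.

```python
import resource
import subprocess
import sys


def core_limit():
    """Return the (soft, hard) core-file size limit of the current process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def child_core_limit():
    """Start a child process and return the core limit it sees."""
    code = "import resource; print(*resource.getrlimit(resource.RLIMIT_CORE))"
    out = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    ).stdout
    soft, hard = (int(v) for v in out.split())
    return soft, hard


def describe(value):
    """Render a limit value the way ulimit would name an unbounded limit."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = core_limit()
    print("parent soft/hard:", describe(soft), describe(hard))
    soft, hard = child_core_limit()
    print("child  soft/hard:", describe(soft), describe(hard))
```

If the soft limit reported here is 0, no core file will be written regardless of where the process crashes.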

      -------------- End header heap dump -----------
      WARNING: Unmarshal failed due to unknow obj type 121 after 776 bytes
      ---- Dumping header heap @ 0x2aaab7284808 - len 1209 ------
      0x2aaab7284808: 0xdcbafeed 0xa1e63532 0xb7284b80 0x2aaa
      0x2aaab7284818: 0xb7284890 0x2aaa 0x378 0x907cbe00
      0x2aaab7284828: 0x0 0x0 0x0 0x1fab0054
      0x2aaab7284838: 0x0 0x0 0x1c01430 0x0
      0x2aaab7284848: 0xb7284b80 0x2aaa 0x141 0xd48914a0
      0x2aaab7284858: 0x1b564038 0x816a7c80 0x0 0x0
      0x2aaab7284868: 0x7e29048b 0x6fbc0a8e 0x5b1c996 0x4daab81d

      I don't see this behavior with 2.0.1.

      Let me know if you need more information.

      Thanks
      – Pranav

          People

          • Assignee: John Plevyak
          • Reporter: Leif Hedstrom
          • Votes: 0
          • Watchers: 1