Description
Recent tests show that with the accept thread enabled, ATS waits for a period of time before handling the request from the client.
With accept_thread enabled, this happens about every 2 minutes: processing is delayed by 1+ seconds between the accept and the creation of the ATS session (accept -> 1+ seconds -> ats session born).
If accept_thread is disabled, it works fine.
Our test environment:
client -> ats (ats-4.0.2_77 or ats-5.0.1_12) -> os (yapache, keep-alive off)
records.config
CONFIG proxy.config.http.cache.http INT 0
https://git.corp.yahoo.com/gist/sho/a23cf06a626bfa5d2041
plugin.config
EMPTY
remap.config
map http://scheduler.nevec.yahoo.com:4080/ http://b-sched1.ec.tw1.yahoo.com:4080/
Suspected tcpdump result:
Client to ATS
01:44:42.448130 IP 10.236.18.241.49306 > 10.236.3.123.4080: Flags [P.], seq 1:822, ack 1, win 57, options [nop,nop,TS val 3868640125 ecr 2513718106], length 821
E..i(?@.=...
...
..
..KZGET /v1/scheduler/bid/job/187357436?rand=33918061510469269371075356787 HTTP/1.1^M
Accept: /^M
host:scheduler.nevec.yahoo.com:4080^M
…
ATS to Origin Server
01:44:44.642019 IP 10.236.3.123.44097 > 10.236.18.254.4080: Flags [P.], seq 1:907, ack 1, win 57, options [nop,nop,TS val 2513722222 ecr 2932310262], length 906
E...G.@.@...
..{
....A..$...d......90......
..[n....GET /v1/scheduler/bid/job/187357436?rand=33918061510469269371075356787 HTTP/1.1^M
Accept: /^M
host:scheduler.nevec.yahoo.com:4080^M
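The gap between the two captures can be checked with a few lines of Python. This is only a sketch: the helper name `capture_time` is made up for illustration, the input strings are the two header lines copied from the capture (truncated after the flags), and it assumes both packets were captured with the same clock and on the same day.

```python
from datetime import datetime


def capture_time(line: str) -> datetime:
    """Parse the leading HH:MM:SS.ffffff timestamp of a tcpdump header line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


# Header lines copied from the capture above (truncated after the flags).
client_to_ats = "01:44:42.448130 IP 10.236.18.241.49306 > 10.236.3.123.4080: Flags [P.]"
ats_to_origin = "01:44:44.642019 IP 10.236.3.123.44097 > 10.236.18.254.4080: Flags [P.]"

# Time between the request arriving at ATS and ATS forwarding it to the origin.
delay = (capture_time(ats_to_origin) - capture_time(client_to_ats)).total_seconds()
print(f"request forwarded {delay:.6f} s after it arrived")
```

For this request the difference is roughly 2.19 seconds, which is in line with the 1+ seconds delay described above.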
Suspected debug log:
[Aug 21 01:44:44.355] Server
[Aug 21 01:44:44.357] Server {0x2b7527b80700}
DEBUG: (http_cs) [40822] Starting transaction 1 using sm [40822]
[Aug 21 01:44:44.498] Server
DEBUG: (http_ss) [40822] session born, netvc 0x2b753c022590
[Aug 21 01:44:44.627] Server
DEBUG: (http)
+++++++++ Incoming Request +++++++++
-- State Machine Id: 40822
GET http://b-sched1.ec.tw1.yahoo.com:4080/v1/scheduler/bid/job/187357436?rand=33918061510469269371075356787 HTTP/1.1^M
Accept: /^M
host:scheduler.nevec.yahoo.com:4080^M
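To see where the time goes, the debug log timestamps can be lined up against the arrival time of the client packet. The snippet below is a rough sketch under some assumptions: `log_time` and `STAMP` are illustrative names, the log lines are the timestamped lines quoted above, and comparing them with the tcpdump timestamp only makes sense if the capture was taken on the ATS host or the clocks were in sync.

```python
import re
from datetime import datetime

# Matches the time-of-day part of a "[Aug 21 01:44:44.355]" style prefix.
STAMP = re.compile(r"\[\w{3} +\d{1,2} (\d{2}:\d{2}:\d{2}\.\d+)\]")


def log_time(line):
    """Return the time of day from a debug log line, or None if it has no prefix."""
    m = STAMP.search(line)
    return datetime.strptime(m.group(1), "%H:%M:%S.%f") if m else None


log_lines = [
    "[Aug 21 01:44:44.355] Server",
    "[Aug 21 01:44:44.357] Server {0x2b7527b80700}",
    "[Aug 21 01:44:44.498] Server",
    "[Aug 21 01:44:44.627] Server",
]
# Arrival of the client request, taken from the tcpdump output.
packet_seen = datetime.strptime("01:44:42.448130", "%H:%M:%S.%f")

times = [log_time(line) for line in log_lines]
before_first_log = (times[0] - packet_seen).total_seconds()
steps = [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]

print(f"packet -> first log line: {before_first_log:.3f} s")
print(f"gaps between log lines:   {steps}")
```

Under those assumptions, most of the delay (about 1.9 seconds) falls before the first debug line is written, while the steps recorded in the log itself are each well under 200 ms apart.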
Bryan Call says: Besides the delay between the accept thread and the net thread, it looks like the client session is on one thread and the server session is on another. Running with a global server session pool might be introducing a delay.
Issue Links
- is related to
  - TS-1951 Disabling accept thread causes high amount of kernel spin-lock (Open)
  - TS-3714 TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times. (Closed)
  - TS-329 delay seen in handling the client's request while running with accept thread (Closed)