Details
- Improvement
- Status: Resolved
- Major
- Resolution: Duplicate
Description
We have origin connection limits: the parameter proxy.config.http.server_max_connections limits the total number of origin connections, and the parameter proxy.config.http.origin_max_connections limits the connections per origin server IP / host.
When its origin connections exceed these limits, the HTTP transaction is rescheduled / retried after 100ms.
When the origin server is very slow, these transactions are rescheduled repeatedly. The retries make the system worse, because more and more HTTP transactions accumulate.
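To make the accumulation concrete, here is a toy model, not Traffic Server code. It assumes fixed 100ms ticks, a constant arrival rate, and an origin that completes a fixed number of requests per tick; the function and all of its parameters are hypothetical names made up for this sketch.

```cpp
#include <algorithm>
#include <cstddef>

// Toy model (hypothetical, not Traffic Server code): counts the transactions
// still waiting for an origin connection after `ticks` intervals of 100 ms.
// Every waiting transaction retries on every tick, as the 100 ms reschedule does.
std::size_t
waiting_after(std::size_t ticks, std::size_t arrivals_per_tick, std::size_t completions_per_tick,
              std::size_t max_connections)
{
  std::size_t active  = 0; // transactions holding an origin connection
  std::size_t waiting = 0; // transactions rescheduled because the limit was hit

  for (std::size_t t = 0; t < ticks; ++t) {
    // The slow origin finishes only a few requests per tick.
    active -= std::min(active, completions_per_tick);

    // New arrivals and all rescheduled transactions compete for free slots.
    std::size_t candidates = waiting + arrivals_per_tick;
    std::size_t admitted   = std::min(candidates, max_connections - active);

    active += admitted;
    waiting = candidates - admitted;
  }
  return waiting;
}
```

With 5 arrivals and 1 completion per tick and a limit of 10 connections, the waiting count in this model grows by 4 per tick once the limit is reached, while matching arrival and completion rates keep it at 0.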
I think we should cut off these transactions immediately rather than retry after 100ms.
The rescheduling code in proxy/http/HttpSM.cc is:
    if (sum >= t_state.http_config_param->server_max_connections) {
      ink_assert(pending_action == NULL);
      pending_action = eventProcessor.schedule_in(this, HRTIME_MSECONDS(100));
      httpSessionManager.purge_keepalives();
      return;
    }
  }
  // Check to see if we have reached the max number of connections on this
  // host.
  if (t_state.txn_conf->origin_max_connections > 0) {
    ConnectionCount *connections = ConnectionCount::getInstance();

    char addrbuf[INET6_ADDRSTRLEN];
    if (connections->getCount((t_state.current.server->addr)) >= t_state.txn_conf->origin_max_connections) {
      DebugSM("http", "[%" PRId64 "] over the number of connection for this host: %s", sm_id,
              ats_ip_ntop(&t_state.current.server->addr.sa, addrbuf, sizeof(addrbuf)));
      ink_assert(pending_action == NULL);
      pending_action = eventProcessor.schedule_in(this, HRTIME_MSECONDS(100));
      return;
    }
  }
Issue Links
- is depended upon by TS-3313 New World order for connection management and timeouts (Closed)
Hmmm, not sure I agree that we should "cut off" transactions immediately. The purpose here is to deal with spiky traffic, afaik, and giving error pages seems draconian. Maybe it could be a configuration option, with a default of 100ms, where setting it to 0 means "cut off" immediately. Someone running into serious problems could also bump it up to a higher value, say 1000ms.
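A minimal sketch of that option, assuming a single retry-delay setting where 0 means "cut off" immediately. RetryConfig, origin_retry_delay and next_retry_delay are hypothetical names for illustration only; none of them is an existing Traffic Server setting or function.

```cpp
#include <chrono>
#include <optional>

// Hypothetical setting (not an existing records.config name): how long to
// wait before retrying a transaction that hit an origin connection limit.
struct RetryConfig {
  std::chrono::milliseconds origin_retry_delay{100}; // default matches today's hard-coded 100 ms
};

// Returns the delay before the next attempt, or std::nullopt when the
// transaction should be failed right away (delay configured as 0).
std::optional<std::chrono::milliseconds>
next_retry_delay(const RetryConfig &cfg)
{
  if (cfg.origin_retry_delay.count() <= 0) {
    return std::nullopt; // "cut off" immediately
  }
  return cfg.origin_retry_delay;
}
```

In this sketch the caller would reschedule when a delay comes back and return an error to the client otherwise, so the default keeps the current behavior.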
If we made this overridable, a plugin could also be smart about this and make the back-off time dynamic: start with 100ms, bump it up towards some max number (say 2000ms) as we accumulate more and more, and after that, start killing transactions immediately. The point would be, though, that some basic capability is records.config'urable, and more advanced behavior like the above could be achieved via a plugin.
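The dynamic back-off could look roughly like the sketch below. It is only an illustration: BackoffPolicy and backoff_delay are hypothetical names, doubling per retry is an assumed growth rule, and keying the delay on the transaction's retry count (rather than on how many transactions are queued) is also an assumption.

```cpp
#include <algorithm>
#include <chrono>
#include <optional>

using std::chrono::milliseconds;

// Hypothetical policy values taken from the numbers suggested above.
struct BackoffPolicy {
  milliseconds initial{100}; // first retry delay
  milliseconds max{2000};    // ceiling for the retry delay
};

// `attempt` is 0 for the first retry. The delay doubles on each retry
// (assumed growth rule) until it reaches `max`. Once a retry at `max` has
// already been used, std::nullopt tells the caller to kill the transaction.
std::optional<milliseconds>
backoff_delay(const BackoffPolicy &policy, unsigned attempt)
{
  milliseconds delay = policy.initial;
  for (unsigned i = 0; i < attempt; ++i) {
    if (delay >= policy.max) {
      return std::nullopt; // already waited at the ceiling; give up
    }
    delay = std::min(delay * 2, policy.max);
  }
  return delay;
}
```

With the default policy this yields 100, 200, 400, 800, 1600 and 2000 ms for attempts 0 through 5, and no further retry from attempt 6 on.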
Thoughts?