Bug 36827 - Need an option to sever socket connections between mod_jk and ajp connector after request/response cycle.
Summary: Need an option to sever socket connections between mod_jk and ajp connector ...
Status: RESOLVED INVALID
Alias: None
Product: Tomcat Connectors
Classification: Unclassified
Component: Common
Version: unspecified
Hardware: All Linux
Importance: P2 major
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-27 16:01 UTC by Remy Gendron
Modified: 2008-10-05 03:09 UTC

Description Remy Gendron 2005-09-27 16:01:41 UTC
Here’s the situation as it stands today and what can be done to solve it. I’ll 
try to keep this short.

Running configuration:

•	Running on Red Hat Enterprise Linux 3
•	1 X F5 load balancer and hardware SSL box.
•	5 X Apache 1.3.33/mod_jk 1.2.14
•	6 X JBoss 4.0.0/Tomcat 5.0.28 using the AJP13 connector. 
•	Oracle 9i

Our production environment hosts a number of applications, each with different 
load and usage patterns. Our problem comes from the fact that it is difficult 
to find a web farm configuration that will satisfy every application. For 
reasons I will not explain here, we cannot have a dedicated web farm for each 
application.

This is what we think is happening in our production environment, based on 
tests run in UAT (User Acceptance Tests) and on literature from the Apache and 
Tomcat projects. This is all pretty new to us, so anyone who can provide hard 
facts is more than welcome.

1.	The 1.3 generation of Apache web servers will spawn a child process to 
handle an HTTP request. Only one HTTP request at a time can be processed by 
that child. 
2.	As the load increases on the web server, additional child processes 
will be spawned to concurrently serve the requests. There is a default limit 
to how many child processes can be forked. That limit defaults to 256 but has 
been changed in production to 16384. This is the MaxClients directive. It 
seems that production really needs the 16384 value instead of the 256 default. 
With 256, our web servers were rejecting connections and could not support the 
load generated by all of our clients.
3.	To prevent latency, Apache keeps up to 100 spare child processes 
alive. Spare means that they are not serving requests. Once reached, that 
number of spare servers does not seem to decrease. This matches the number we 
see in our tests in UAT, where 201 threads remain active in Tomcat: 100 spare 
child connections * 2 web servers, plus one accept() thread. 
4.	If a request needs to be forwarded to Tomcat/JBoss (dynamic pages), 
the mod_jk module in the child process will open a socket connection to the 
ajp13 connector in Tomcat. 
5.	Tomcat will accept the connection and create a thread to serve it. 
Connections will be accepted up to a concurrent maximum of 1200. This upper 
value has been set by us. 
6.	Tomcat will reject connections when the maximum is reached. JBoss 
4.0.0 has a known issue where the server will die when the maximum is reached. 
This has been fixed in 4.0.2. 
7.	A connection could potentially be recycled in mod_jk (recycle_timeout) 
if no activity occurs thru the connection. However, any requests to Tomcat 
from any user session-bound to that Tomcat instance could go thru the 
connection, thus keeping it active. Recycling does not seem to occur. We use a 
recycle_timeout value of 300.
8.	The fact that the production web servers can potentially serve up to 
16384 concurrent requests makes it possible for a web server to open far more 
connections to Tomcat than Tomcat will accept, and nuke it. 
9.	Tomcat can then become overloaded with connections. If a valid HTTP 
request comes thru Apache and is routed to a child process that has not yet 
made a connection to Tomcat, the connection may be refused because Tomcat has 
already reached its limit of 1200. 
10.	In that case, mod_jk could potentially fail over to another Tomcat. 
The user would, however, lose his session.
11.	The recycle_timeout and cache_size options are of no use to us 
because too many web server children are created to serve the company load. 
Thus, many different routes can be taken by requests targeted to our 
application, keeping all the connections alive.
12.	We tried really small recycle_timeout values (e.g. 20) with no effect. 
A netstat reveals that connections remain ESTABLISHED. 
13.	The MaxRequestsPerChild setting is set to 0 in PROD. This means that 
Apache child processes never die unless the MaxSpareServers value is 
exceeded. Thus, at least 100 connections per web server will always remain 
connected to Tomcat. A value greater than 0 would at least guarantee that a 
child process eventually dies, freeing Tomcat connections and releasing 
leaked memory back to the OS. 
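
The connection counts described in the list above can be checked with a little arithmetic. The following Python sketch is only an illustration: the function names are made up here, and it assumes each Apache child holds at most one connection to a given Tomcat instance, which matches the behaviour described above but has not been verified against mod_jk itself.

```python
def idle_tomcat_threads(spare_children, web_servers, accept_threads=1):
    """Threads that stay alive in Tomcat when only spare children remain."""
    return spare_children * web_servers + accept_threads


def worst_case_connections(max_clients, web_servers):
    """Upper bound on connections one Tomcat can receive from the farm."""
    return max_clients * web_servers


# UAT observation: 100 spare children on each of 2 web servers.
print(idle_tomcat_threads(100, 2))              # 201

# Production: MaxClients 16384 on each of 5 web servers, Tomcat limit 1200.
print(worst_case_connections(16384, 5))         # 81920
print(worst_case_connections(16384, 5) > 1200)  # True
```

Under these assumptions the production ceiling is far above what a single Tomcat instance is configured to accept.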

It’s hard to see a path out of this one.

One solution would be to reduce the MaxClients Apache setting back to 256. 
This would mean that a single instance of Tomcat would not be hit by more 
than 256 * 5 = 1280 connections (5 is the web farm size). Our current JVM 
settings (heap + thread stack sizes) would allow this. We would also need to 
bump our current 1200 limit a bit higher. However, this solution is not 
compatible with other applications which have really high loads.
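
As a rough check of this first option, the sketch below compares the farm-wide connection ceiling with the Tomcat thread limit. The helper name and the idea of expressing the result as a shortfall are illustrative assumptions, not part of any Apache or Tomcat tooling.

```python
def thread_shortfall(max_clients, farm_size, tomcat_max_threads):
    """How many more Tomcat threads are needed to accept every child."""
    needed = max_clients * farm_size
    return max(0, needed - tomcat_max_threads)


# MaxClients back at 256, 5 web servers, current Tomcat limit of 1200.
print(thread_shortfall(256, 5, 1200))   # 80
# With the limit raised to 1280 there is no shortfall left.
print(thread_shortfall(256, 5, 1280))   # 0
```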

A second option would be to patch mod_jk so that connections are dropped as 
soon as the response has been received from Tomcat. Drawbacks include 
preventing us from upgrading to new releases (unless we re-apply the 
modifications), introducing the risk of breaking something in this add-on, 
concentrating knowledge in the head of the person making the changes, and 
introducing yet another component for the prod people to know and manage. The 
overhead of opening a new connection for each request is probably quite small 
but would need to be validated.

Finally, having our own web farm would be another solution. However, this goes 
against Production's master plan of having only one web farm for production.
Comment 1 Rainer Jung 2005-09-27 16:59:03 UTC
Please switch over to tomcat-user for discussion. This is not a bug.

First hints from my side: reduce to an equal number of apache and tomcat
instances, configure F5 with a rule that sends URLs with a session cookie or
jsessionid in the URL to the "correct" apache. Furthermore, configure mod_jk
such that each apache sends requests without sessions to its preferred tomcat
partner.

That way almost all apache processes will connect to only one tomcat.

If you still need 16K apache processes per instance, you are in trouble (maybe
upgrade to apache 2). If you manage to handle the workload with 1K apache
processes, 1K threads in Tomcat should be OK.
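
The routing idea above could be sketched roughly as follows. This is not F5 or mod_jk configuration, just Python showing the decision logic. It assumes each Tomcat is configured with a jvmRoute, so that session IDs end in ".<route>"; the host names, route names and the PARTNER table are hypothetical.

```python
import re
from itertools import cycle

# Hypothetical pairing: each Tomcat jvmRoute maps to its partner Apache.
PARTNER = {"tc1": "apache1", "tc2": "apache2", "tc3": "apache3"}
_round_robin = cycle(sorted(PARTNER.values()))

# Matches a session ID carried as a ";jsessionid=" path parameter.
_URL_ID = re.compile(r";jsessionid=([^?#;/]+)", re.IGNORECASE)


def session_route(url, cookies):
    """Return the jvmRoute suffix of the session ID, or None if absent."""
    match = _URL_ID.search(url)
    session_id = match.group(1) if match else cookies.get("JSESSIONID")
    if not session_id or "." not in session_id:
        return None
    return session_id.rsplit(".", 1)[1]


def pick_apache(url, cookies):
    """Send session traffic to the partner Apache, the rest round-robin."""
    route = session_route(url, cookies)
    if route in PARTNER:
        return PARTNER[route]
    return next(_round_robin)


print(pick_apache("/app/page;jsessionid=ABC123.tc2?x=1", {}))
print(pick_apache("/app/page", {"JSESSIONID": "XYZ789.tc3"}))
```

With this kind of rule, requests that already belong to a session always arrive at the apache whose preferred partner holds that session, so that apache rarely needs connections to the other tomcats.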
Comment 2 Yoav Shapira 2005-10-13 06:42:37 UTC
As Rainer noted...