Bug 48470

Summary: Tomcat hangs while stopping
Product: Tomcat 6
Reporter: Turks <pflahrty>
Component: Connectors
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: marcin.balcer
Priority: P2
Version: 6.0.20
Target Milestone: default
Hardware: PC
OS: Windows Vista
Attachments: Proposed patch

Description Turks 2010-01-01 12:56:28 UTC
Tomcat 6.0.20 running as a service on a 64-bit Windows 7 machine with a quad-core
processor hangs sporadically when stopping the service.

This is consistent across a variety of similar machines we have in our development
lab. Tomcat 5.5.26 is rock solid while starting and stopping the service on the same platforms. The problem was definitely introduced in Tomcat 6 at some point.

I tried a variety of JDKs, and the Java version appears to make no difference: Tomcat still hangs while trying to stop the service.

Is this possibly fixed but not yet packaged into a new build?

Thanks
Comment 1 Konstantin Kolinko 2010-01-01 16:07:25 UTC
Please take two or more successive thread dumps from a "hung" Tomcat instance. Comparing them will show which threads are stuck and where.

Here is a FAQ article:

http://wiki.apache.org/tomcat/HowTo#How_do_I_obtain_a_thread_dump_of_my_running_webapp_.3F


> Is this possibly fixed but not yet packaged into a new build?

The users@ list archives are searchable, if you are looking for other reports of the same problem.  I do not remember any, though.
Comment 2 Mark Thomas 2010-01-11 06:41:15 UTC
Coincidentally, one of our customers saw a similar issue when moving from 5.5.x to 6.0.x.

I can't provide the stack traces but I can provide the analysis. It looks like Tomcat is being stopped under load. In these circumstances, the connection created in unlockAccept() in the endpoint may get stuck in the TCP backlog queue. Since the connection in unlockAccept() is created without a timeout, this causes the shutdown to block forever.

Tomcat 7 already has a configurable timeout for unlockAccept. I will look at porting this to Tomcat 6.
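
To illustrate the idea of bounding the unlock connection, here is a minimal sketch in plain java.net code. It is not the Tomcat implementation: the class name UnlockAcceptSketch, the method names, and the 2000 ms value are hypothetical and chosen only for this example. It shows the general technique of creating an unconnected Socket and calling connect() with an explicit timeout, so the caller gives up instead of waiting indefinitely.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch, not Tomcat code: a bounded "unlock" connection.
public class UnlockAcceptSketch {

    /**
     * Opens a throwaway connection to the listening address so that a thread
     * blocked in accept() can wake up. Returns true if the connection was
     * established within timeoutMillis, false if it timed out or failed.
     */
    static boolean unlockAccept(InetSocketAddress target, int timeoutMillis) {
        try (Socket s = new Socket()) {
            // connect() with a positive timeout throws SocketTimeoutException
            // (an IOException) when the limit is reached; a timeout of 0
            // would mean "wait indefinitely".
            s.connect(target, timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Demonstrates the call against a local listener on an ephemeral port. */
    static boolean demo() {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            InetSocketAddress target = new InetSocketAddress(
                    server.getInetAddress(), server.getLocalPort());
            return unlockAccept(target, 2000); // 2000 ms is an arbitrary example value
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("unlock connection made: " + demo());
    }
}
```

With a bounded connect, a shutdown thread that cannot get its connection through a full backlog would see a false return (or an exception) after the timeout and could continue with the rest of the stop sequence.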
Comment 3 Mark Thomas 2010-01-11 08:10:35 UTC
Created attachment 24827
Proposed patch

This patch addresses the potential for the connector shutdown to block when Tomcat is shut down under load.

It also ensures localhost is used consistently for unlockAccept() if no specific address is provided for the connector. This should be compatible with systems that use IPv4 and/or IPv6.
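
The address selection described here could look roughly like the following sketch. It is an assumption about the general approach, not a copy of the patch: the class UnlockAddressSketch and the method unlockTarget are hypothetical names. The point it illustrates is that resolving the name "localhost" leaves the choice between an IPv4 and an IPv6 loopback address to the platform's resolver, rather than hard-coding one address family.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical sketch, not Tomcat code: choosing the unlock target address.
public class UnlockAddressSketch {

    /**
     * Returns the address the unlock connection should be made to.
     * If the connector has no configured address, fall back to the name
     * "localhost" and let the resolver pick an IPv4 or IPv6 loopback.
     */
    static InetSocketAddress unlockTarget(InetAddress configured, int port) {
        if (configured == null) {
            // InetSocketAddress(String, int) attempts to resolve the host name.
            return new InetSocketAddress("localhost", port);
        }
        return new InetSocketAddress(configured, port);
    }

    public static void main(String[] args) {
        System.out.println(unlockTarget(null, 8080));
        System.out.println(unlockTarget(InetAddress.getLoopbackAddress(), 8080));
    }
}
```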
Comment 4 Mark Thomas 2010-01-11 08:13:03 UTC
The attached patch has been proposed for 6.0.x.

Note the 5.5.x code is quite different in this area and the reports indicate that this issue affects 6.0.x but not 5.5.x.
Comment 5 Mark Thomas 2010-01-11 09:44:06 UTC
*** Bug 47670 has been marked as a duplicate of this bug. ***
Comment 6 Konstantin Kolinko 2010-01-11 17:19:28 UTC
The patch in attachment 24827 looks good, though I have not tried to run it yet.

+1 to adding a "Socket unlock completed for:" debug message to AprEndpoint, as is done in JIoEndpoint.


Regarding s.setSoLinger(true, 0):
I see that NioEndpoint of TC6 and all endpoint implementations of TC7 use
 s.setSoLinger(getSocketProperties().getSoLingerOn(), ...)

The default value of soLingerOn is based on the Constants.DEFAULT_CONNECTION_LINGER constant (in o.a.coyote.http11 or in o.a.coyote.ajp), which is -1. Thus it will be false.

I think that s.setSoLinger(true, 0) should be used in unlockAccept() for its dummy connection in all endpoint implementations, though I have not tested it.
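
For reference, the difference between the two settings can be seen with the standard java.net.Socket API. The sketch below is illustrative only: SoLingerSketch and lingerAfter are hypothetical names, and the loopback listener exists just to give the socket something to connect to. With linger enabled and a value of 0, closing the socket typically discards unsent data and resets the connection instead of going through the normal close sequence, which is why it suits a throwaway connection.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch, not Tomcat code: reading back SO_LINGER settings.
public class SoLingerSketch {

    /**
     * Connects to a local listener, applies the given SO_LINGER setting,
     * and returns the value read back via getSoLinger().
     * getSoLinger() returns -1 when the option is disabled.
     * Returns Integer.MIN_VALUE if the socket could not be set up.
     */
    static int lingerAfter(boolean on, int seconds) {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = new Socket(server.getInetAddress(),
                                   server.getLocalPort())) {
            s.setSoLinger(on, seconds);
            return s.getSoLinger();
        } catch (IOException e) {
            return Integer.MIN_VALUE;
        }
    }

    public static void main(String[] args) {
        // Setting proposed in the comment for the dummy connection.
        System.out.println("linger(true, 0)   -> " + lingerAfter(true, 0));
        // Linger switched off, as when soLingerOn is false.
        System.out.println("linger(false, -1) -> " + lingerAfter(false, -1));
    }
}
```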
Comment 7 Mark Thomas 2010-01-13 02:30:43 UTC
The fix has been applied to 6.0.x and will be included in 6.0.23 onwards.