If one server fails "badly" (I believe resulting in a socket timeout error), the FastAsyncSocketSender is locked by one thread and causes a backlog on all subsequent HTTP threads, causing the entire machine to run out of sockets. Details below.

Default cluster settings:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster" />

We have multiple web machines (6 of them). Something really bad happened at our data center (not sure what; a cable fault, someone tripping over our ethernet, we don't quite know yet) causing one of our web servers to die. The rest of the machines then backlogged trying to replicate to the dead machine, which filled up the max threads on all the web servers and caused a site outage.

We took stack traces at the point where we had to restart the Tomcat process; what I believe to be the relevant stack traces are included below. You can see one of the HTTP threads (143) is trying to replicate synchronously (which I found odd given we use fastasync, but okay). I believe this thread is stuck on a 2-minute socket timeout and currently holds a lock on the FastAsyncSocketSender. Notice the Cluster-MembershipReceiver thread is waiting for the FastAsyncSocketSender object and currently holds a lock on the ReplicationTransmitter. Notice HTTP thread 147 is waiting on the ReplicationTransmitter. As a result, I have about 298 other HTTP threads all waiting on the ReplicationTransmitter (I had 300 threads configured).

Now, I realise that after a while the socket will time out and it will all work itself out, but our site was stuck in this mode for over 10 minutes, so I think this is a bug: one machine dying (albeit badly) shouldn't cause all the other machines to backlog at all.
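For reference, the one-line `<Cluster>` element above relies entirely on built-in defaults. An explicit sender configuration selecting the queued async mode would look something like the sketch below (attribute names as I understand them from the 5.5 cluster docs; please double-check against your exact version):

```xml
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster">
  <!-- fastasyncqueue is intended to queue replication messages and send
       them from a separate thread, off the request path -->
  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
          replicationMode="fastasyncqueue"/>
</Cluster>
```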
------------
"http-80-Processor143" daemon prio=1 tid=0x084ad748 nid=0x6953 runnable [0x7e7bf000..0x7e7bf63c]
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:124)
    at org.apache.catalina.cluster.tcp.DataSender.writeData(DataSender.java:830)
    at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:772)
    at org.apache.catalina.cluster.tcp.DataSender.sendMessage(DataSender.java:598)
    - locked <0x4e7864f8> (a org.apache.catalina.cluster.tcp.FastAsyncSocketSender)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:868)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageClusterDomain(ReplicationTransmitter.java:460)
    at org.apache.catalina.cluster.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:1017)
    at org.apache.catalina.cluster.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:333)
    at org.apache.catalina.cluster.tcp.ReplicationValve.invoke(ReplicationValve.java:271)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.valves.FastCommonAccessLogValve.invoke(FastCommonAccessLogValve.java:495)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:868)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:663)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)
"Cluster-MembershipReceiver" daemon prio=1 tid=0x78804ad8 nid=0x661c waiting for monitor entry [0x786ff000..0x786ff73c]
    at org.apache.catalina.cluster.tcp.DataSender.disconnect(DataSender.java:560)
    - waiting to lock <0x4e7864f8> (a org.apache.catalina.cluster.tcp.FastAsyncSocketSender)
    at org.apache.catalina.cluster.tcp.FastAsyncSocketSender.disconnect(FastAsyncSocketSender.java:295)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.remove(ReplicationTransmitter.java:689)
    - locked <0x4e7a4e68> (a org.apache.catalina.cluster.tcp.ReplicationTransmitter)
    at org.apache.catalina.cluster.tcp.SimpleTcpCluster.memberDisappeared(SimpleTcpCluster.java:1124)
    at org.apache.catalina.cluster.mcast.McastService.memberDisappeared(McastService.java:455)
    at org.apache.catalina.cluster.mcast.McastServiceImpl.receive(McastServiceImpl.java:221)
    at org.apache.catalina.cluster.mcast.McastServiceImpl$ReceiverThread.run(McastServiceImpl.java:253)

"http-80-Processor147" daemon prio=1 tid=0x084b1208 nid=0x6957 waiting for monitor entry [0x7e8bf000..0x7e8bf83c]
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.addStats(ReplicationTransmitter.java:702)
    - waiting to lock <0x4e7a4e68> (a org.apache.catalina.cluster.tcp.ReplicationTransmitter)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:870)
    at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageClusterDomain(ReplicationTransmitter.java:460)
    at org.apache.catalina.cluster.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:1017)
    at org.apache.catalina.cluster.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:333)
    at org.apache.catalina.cluster.tcp.ReplicationValve.invoke(ReplicationValve.java:271)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.valves.FastCommonAccessLogValve.invoke(FastCommonAccessLogValve.java:495)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:868)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:663)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)
Update to the current svn head. I have fixed a DataSender refactoring bug: the Tomcat 5.5.12 cluster sends all messages thread-synchronized over only one sender socket. Another bug is that the membership message format has changed (see Bug 37808). Can you please test with the newest cluster code? I now reference your report in the changelog. I am not sure that your scenario doesn't show yet another problem! Is it true that all your nodes are inside the same cluster replication domain? Sorry for the trouble, and thanks for analysing and reporting the bug. Peter
I'll try to get a copy of the new code. I'm not sure I can test it, though, as the circumstances for reproducing this problem are quite particular and I can't really take down our live site to test it. I'll at least read through it to see if the blocking scenario can still take place. As for the nodes: yes, I think so, if I understand your question properly; we only have 1 node name, 1 domain name, and 1 context which is distributable.
Okay, I had a look at the new code and I'm pretty sure the problem will still manifest itself. The problem isn't with the code per se; I think it's a problem with the design of the replication. Since the replication is in-line with the HTTP request (new-session requests appear to be in-line even in async mode), if something bad happens to the replication socket that causes it to hang, it will block that thread (presumably until SO_TIMEOUT, which is usually 2 minutes). If a subsequent request comes in for a new session, it will try to do the same thing, but it needs to acquire a lock on the DataSender object via sendMessage(); that thread will therefore block until the previous thread times out. As a result, many threads may back up while waiting for the original socket to time out.

Note that this could happen even in pooled mode: if all 300 threads had their own 300 replicator sockets, and each sat there waiting 2 minutes to time out, then there would be no more threads available (assuming I'm maxed at 300 threads). The only thing I can really think of to resolve this is to have the replication on a separate thread from the HTTP requests. Of course, maybe I'm just misreading the code and everything I'm saying might be wrong...
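The contention pattern described above can be reproduced in miniature. The sketch below is not Tomcat code; it is a hypothetical `BlockingSender` with a synchronized send (standing in for DataSender.sendMessage() holding the FastAsyncSocketSender monitor) where the "socket write" simply sleeps, standing in for a write that hangs until SO_TIMEOUT. Four worker threads then serialize on the one lock instead of running in parallel:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch, not Tomcat code: one shared sender whose send()
// is synchronized. If the first caller hangs inside send() (like a socket
// write to a dead peer waiting on SO_TIMEOUT), every other thread queues
// up behind the same monitor.
class BlockingSender {
    synchronized void send(long blockMillis) throws InterruptedException {
        Thread.sleep(blockMillis); // stand-in for the hanging socket write
    }
}

public class LockBacklogDemo {
    public static void main(String[] args) throws Exception {
        BlockingSender sender = new BlockingSender();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        long start = System.nanoTime();
        // Four "http threads" each try to replicate a 200ms message;
        // the sends serialize on the sender's lock.
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                try { sender.send(200); } catch (InterruptedException e) { }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // A parallel run would take ~200ms; serialized it takes ~800ms.
        System.out.println("sends serialized: " + (elapsedMs >= 700));
    }
}
```

Scale the 200ms sleep up to a 2-minute SO_TIMEOUT and the 4 workers up to a 300-thread connector pool, and this is the backlog in the stack traces above.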
Fast async has been designed to be used asynchronously. With such a configuration, no TCP communication for replication is done during the request-response lifecycle. Instead, all replication messages are only put into a local queue. Separate threads pick up these messages and send them. The queue is locked while taking out the messages, but the lock is freed before actually trying to send a message.
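That queue-and-dedicated-sender design can be sketched as follows (a minimal illustration of the idea, not the actual FastAsyncSocketSender implementation): request threads only enqueue, and a separate thread dequeues and performs the potentially slow network I/O, so a hung peer cannot stall the request threads.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the async-queue idea: the request thread only enqueues the
// replication message; a dedicated sender thread takes messages off the
// queue and does the (potentially slow) socket write off the request path.
public class AsyncQueueDemo {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread sender = new Thread(() -> {
            try {
                while (true) {
                    String msg = queue.take(); // queue lock held only while dequeuing
                    if (msg.equals("STOP")) break;
                    // the slow socket write would happen here, and a hang
                    // here only delays this thread, not the request threads
                }
            } catch (InterruptedException ignored) { }
        });
        sender.start();
        // "Request threads" return immediately after enqueuing.
        long start = System.nanoTime();
        for (int i = 0; i < 1000; i++) {
            queue.offer("session-delta-" + i);
        }
        long enqueueMs = (System.nanoTime() - start) / 1_000_000;
        queue.offer("STOP");
        sender.join();
        System.out.println("request threads not blocked: " + (enqueueMs < 1000));
    }
}
```

The trade-off is that a dead member lets the queue grow until the membership service removes that member, so queue growth needs its own bound or monitoring.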
Hmm, if you don't want to update completely to 5.5.15 (svn head), you can compile the cluster module and copy the resulting catalina-cluster.jar into the 5.5.12 release. The default cluster configuration normally uses fastasyncqueue; the problem was that the subclass did not implement the right method. That is now fixed and all is working well. Peter