after running tomcat for a while, all of a sudden browsers keep waiting forever after connecting. With the links-linemode browser, it appears that the reason for it surfaced: it hangs during "SSL negotiation" When stopping tomcat manually, I get <<<<DEBUG [http-8080-Processor4] (StandardWrapper.java:700) - Returning non-STM instance Jul 3, 2004 3:52:45 PM org.apache.coyote.http11.Http11Protocol pause INFO: Pausing Coyote HTTP/1.1 on http-8443 Jul 3, 2004 3:52:46 PM org.apache.catalina.core.StandardService stop INFO: Stopping service Privasphere-Standalone Jul 3, 2004 3:52:46 PM org.apache.catalina.core.StandardHost getDeployer INFO: Create Host deployer for direct deployment ( non-jmx ) Jul 3, 2004 3:52:46 PM org.apache.catalina.core.StandardHostDeployer remove INFO: Removing web application at context path DEBUG [main] (StandardManager.java:676) - Stopping DEBUG [main] (StandardManager.java:500) - Unloading persisted sessions DEBUG [main] (StandardManager.java:507) - Saving persisted sessions to SESSIONS.ser DEBUG [main] (StandardManager.java:530) - Unloading 0 sessions DEBUG [main] (StandardManager.java:574) - Expiring 0 persisted sessions DEBUG [main] (StandardManager.java:586) - Unloading complete Jul 3, 2004 3:52:46 PM org.apache.catalina.core.StandardHost getDeployer INFO: Create Host deployer for direct deployment ( non-jmx ) Jul 3, 2004 3:52:46 PM org.apache.catalina.core.StandardHostDeployer remove INFO: Removing web application at context path DEBUG [main] (StandardManager.java:676) - Stopping DEBUG [main] (StandardManager.java:500) - Unloading persisted sessions DEBUG [main] (StandardManager.java:507) - Saving persisted sessions to SESSIONS.ser DEBUG [main] (StandardManager.java:530) - Unloading 8 sessions INFO [main] (ManagerBase.java:898) - Cannot serialize session attribute logoPrepaid for session EB7C9AF5C1736B2D04765F4DB2BE2E 67 java.io.NotSerializableException: java.io.ByteArrayInputStream at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1054) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1332) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1304) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1247) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278) at org.apache.catalina.session.StandardSession.writeObject(StandardSession.java:1411) at org.apache.catalina.session.StandardSession.writeObjectData(StandardSession.java:899) at org.apache.catalina.session.StandardManager.doUnload(StandardManager.java:539) at org.apache.catalina.session.StandardManager.unload(StandardManager.java:485) at org.apache.catalina.session.StandardManager.stop(StandardManager.java:687) at org.apache.catalina.core.StandardContext.stop(StandardContext.java:4474) at org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:952) at org.apache.catalina.core.StandardHostDeployer.remove(StandardHostDeployer.java:670) at org.apache.catalina.core.StandardHost.remove(StandardHost.java:946) at org.apache.catalina.startup.HostConfig.undeployApps(HostConfig.java:1009) at org.apache.catalina.startup.HostConfig.stop(HostConfig.java:986) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:351) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1123) at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1135) at org.apache.catalina.core.StandardEngine.stop(StandardEngine.java:483) at org.apache.catalina.core.StandardService.stop(StandardService.java:538) at org.apache.catalina.core.StandardServer.stop(StandardServer.java:2332) at org.apache.catalina.startup.Catalina.stop(Catalina.java:605) at org.apache.catalina.startup.Catalina.start(Catalina.java:580) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:324) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:284) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:422) INFO [main] (ManagerBase.java:898) - Cannot serialize session attribute payRecip for session EB7C9AF5C1736B2D04765F4DB2BE2E67 java.io.NotSerializableException: java.io.ByteArrayInputStream at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1054) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1332) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1304) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1247) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1332) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1304) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1247) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278) at org.apache.catalina.session.StandardSession.writeObject(StandardSession.java:1411) at org.apache.catalina.session.StandardSession.writeObjectData(StandardSession.java:899) at org.apache.catalina.session.StandardManager.doUnload(StandardManager.java:539) at org.apache.catalina.session.StandardManager.unload(StandardManager.java:485) at org.apache.catalina.session.StandardManager.stop(StandardManager.java:687) at org.apache.catalina.core.StandardContext.stop(StandardContext.java:4474) at org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:952) at org.apache.catalina.core.StandardHostDeployer.remove(StandardHostDeployer.java:670) at org.apache.catalina.core.StandardHost.remove(StandardHost.java:946) at org.apache.catalina.startup.HostConfig.undeployApps(HostConfig.java:1009) at org.apache.catalina.startup.HostConfig.stop(HostConfig.java:986) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:351) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1123) at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1135) at org.apache.catalina.core.StandardEngine.stop(StandardEngine.java:483) at org.apache.catalina.core.StandardService.stop(StandardService.java:538) at org.apache.catalina.core.StandardServer.stop(StandardServer.java:2332) at org.apache.catalina.startup.Catalina.stop(Catalina.java:605) at org.apache.catalina.startup.Catalina.start(Catalina.java:580) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:324) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:284) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:422) DEBUG [main] (StandardManager.java:574) - Expiring 8 persisted sessions DEBUG [main] (StandardManager.java:586) - Unloading complete INFO [main] (StandardWrapper.java:1213) - Waiting for 2 instance(s) to be deallocated INFO [main] (StandardWrapper.java:1213) - Waiting for 2 instance(s) to be deallocated INFO [main] (StandardWrapper.java:1213) - Waiting for 2 instance(s) to be deallocated java.lang.OutOfMemoryError ...>> other services with non-https connectors were not affected. My suspicion is that the insufficient memory was the root cause. If so, it would be great to throw the OutOfMemoryError immediately.
- Mozilla hangs after saying "Connected to hostname.domain.tld..." in the status bar without a timeout (http://bugzilla.mozilla.org/show_bug.cgi?id=249976) - MSIE 6.0.2800.1106.xpsp2.030422-1633 says in the status bar "opening page https://hostname.domain.tld:port/..." - no timeout - Netscape 4.7: "Connect: host hostname.domain.tld:8443 contacted - waiting for reply" - no timeout - redhat 9 ELinks 0.4.2 - Text WWW browser: "SSL negotiation" and no timeout - redhat 9 Links 0.96 - "SSL negotiation" and no timeout - Opera 7.50K 'Sende Anfrage an hostname.domain.tld...' if anybody has a recommendation which browser to use to learn where it breaks down, this would be highly appreciated!
Links 0.96: gives a late timeout "Error Receive timeout [cancel]"
I really don't see anything to fix here (if you think otheriwse, please propose a patch). You have to remember Tomcat isn't implementing SSL (JSSE does that).
The following is one hypothesis for this, but we have seen it also happen on non-ssl connectors (i.e. regular http) - anyway, here the reasoning goes as per http://forum.java.sun.com/wireless/thread.jsp?forum=2&thread=502002&message=2374415#2374415 === We had exactly the same problem and it appeared to be a problem with /dev/random. It seems that if there's not enough entropy, then the usage of /dev/random might block waiting for entropy. You propably don't have enough key pressings, mouse usage, net traffic, etc. for the entropy required to be filled. You can set the system property "java.security.egd" to "file:/dev/urandom" to use /dev/urandom instead. /dev/urandom is secure enough for SSL connection usage. Hope this helps you! - Atso. === see also: http://lists.gnupg.org/pipermail/gnupg-devel/2003-February/019736.html
more insights on this - forget the stacktrace in original description, it is probably our watcher script, that automatically restarted tomcat (and I didn't know about this) However: - the problem that a connector hangs happens with all 3 connectors I have configured in my server.xml (8443 with https, 8080 and 2712 with plain http) - what is still possible is to send tomcat a kill -QUIT to obtain a full thread dump and nothing appears to be particularly special since the server is not very busy. However, for each connector I see up to minSpareThreads="25" threads and always one in java.net.PlainSocketImpl.socketAccept() and the rest in java.lang.Object.wait(). This is the healthy situation. Once tomcat hangs, on 1 or more of the connectors, all the threads are in the wait state and none doing socketAccept. In the logs, there is no indication why the connector should loose the accepting thread. The only stacktraces I see come from the struts file-download when, somebody aborts it and other simple errors in the servlet application (typically leading to ... at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:525) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at javax.servlet.http.HttpServlet.service(HttpServlet.java:810) ... at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929) at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160) ...) Any hints how to remedy this would be highly appreciated! I'll attach the JVM thread dump next. P.S.: Appears to have some similarity to the mailing-list description as per the above URL. P.P.S.: there is no out-of-memory situation before like in Bug 31426
Created attachment 12969 [details] threadDump3Connectors.txt
It seems that we have found the solution (keep the fingers crossed ;) ): Apparently, Redhat 9. was the first Redhat Release that uses the new Posix Threads (NPTL), backported by Redhat from 2.6 Kernel to 2.4 Unfortunately, this appears to be buggy (e.g. CommunigatePro also has not been "cleared" for redhat) How to fix: > mv /lib/tls /lib/tls.unused > ldconfig > catalina.sh restart --> this brings back the old, classic Linux-Thread Implementation - slower, but hopefully more robust. When doing less /proc/PID/maps no more shared objects from /lib/tls should be visible The affected system as per "uname -a": Linux myhost 2.4.20-30.9 #1 Wed Feb 4 20:44:26 EST 2004 i686 i686 i386 GNU/Linux Wild enhancement suggestion to prevent similar stability nightmares in the future: ======================================================================= (admitted, it is not really Tomcat's task to fix operating system bugs on the application level, but if it is easy why not doing it anyway) Idea how this could work: org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:616) synchronized (this) { while (!shouldRun && !shouldTerminate) { this.wait(); } _shouldRun = shouldRun; _shouldTerminate = shouldTerminate; _toRun = toRun; } Why could this not only wait to be notified about "work", but also perhaps wake up every minute once by itself (and if another thread is already working going immediately back to sleep since it cannot get the synchronized object.) If that was possible, a big warning should be printed (provided one can detect whether the thread woke up by itself instead of being notified) that tomcat is covering up for underlying OS problems and the Sys-Admin better get informed about possible patch... What do you think?
the same appears to have happened also with http://issues.apache.org/jira/browse/JAMES-324