We got this java.lang.OutOfMemoryError, and afterwards we could no longer connect to Tomcat via HTTP. Although the main problem is the OutOfMemoryError, Tomcat should handle this situation more gracefully. The full error message is:

2004-09-23 23:59:59,135 ERROR [http8080-Processor265] [org.apache.tomcat.util.threads.ThreadPool] Caught exception (java.lang.OutOfMemoryError: unable to create new native thread) executing org.apache.tomcat.util.net.TcpWorkerThread@36b4a7, terminating thread

It looks like the bug is in PoolTcpEndpoint, line 553 (in version 5.0.28) / line 547 (in version 5.0.19): in the method "runIt", right after the "acceptSocket" call, the TcpWorkerThread asks the thread pool of the enclosing TcpEndpoint to execute it again (most importantly "acceptSocket") using "endpoint.tp.runIt(this);". But when tp.runIt fails because it got an OutOfMemoryError upon thread creation, the calling thread "dies" (in ThreadPool.java, line 653 [version 5.0.19]) and no other thread calls "acceptSocket" any more.

Again, although the main issue here is the OutOfMemoryError (which probably has something to do with the number of threads, lack of native memory, etc.), Tomcat should handle that kind of "DoS" gracefully. It should not stop accepting HTTP requests forever (or any kind of requests, since "TcpEndpoint" indicates it is not limited to HTTP). In my opinion, the current TcpWorkerThread should catch the exception, close the current connection and return to "socketAccept" mode.
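To make the proposed behaviour concrete, here is a minimal sketch of it. This is not Tomcat's code: `AcceptLoopSketch`, `Pool` and `Conn` are hypothetical stand-ins for the worker, for `endpoint.tp` and for an accepted socket, and polling a queue stands in for "acceptSocket". It only illustrates the idea that a failed hand-off drops the current connection and leaves the same thread accepting.

```java
import java.util.Queue;

/** Hypothetical sketch of the proposed fix; not Tomcat's TcpWorkerThread. */
class AcceptLoopSketch {

    /** Hypothetical stand-in for the thread pool (endpoint.tp in the report). */
    interface Pool {
        void runIt(Runnable task);
    }

    /** Hypothetical stand-in for an accepted socket. */
    static final class Conn {
        boolean closed;
        boolean processed;
    }

    /**
     * Accepts connections from 'pending' (standing in for acceptSocket()).
     * After each accept, the pool is asked to start a successor acceptor.
     * If that fails with an OutOfMemoryError, the current connection is
     * closed and this thread keeps accepting instead of dying.
     * Returns the number of connections dropped because hand-off failed.
     */
    static int runIt(Queue<Conn> pending, Pool pool) {
        int dropped = 0;
        Conn conn;
        while ((conn = pending.poll()) != null) {
            try {
                // Ask the pool for another thread to take over accepting.
                pool.runIt(() -> runIt(pending, pool));
            } catch (OutOfMemoryError e) {
                // No successor could be started: drop this connection
                // and stay in accept mode on the current thread.
                conn.closed = true;
                dropped++;
                continue;
            }
            // Hand-off succeeded: this thread now serves the connection.
            conn.processed = true;
            conn.closed = true;
            return dropped;
        }
        return dropped;
    }
}
```

With a pool whose `runIt` always throws, every pending connection is closed unprocessed, but the loop keeps running, so the endpoint would start serving requests again once thread creation succeeds.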
You can patch your Tomcat to do that if you want to, but the main problem is that the VM is not in a stable state once an OOM error occurs (and besides, the VM will likely be spending all its time doing full GCs). Finding the source of the OOM would be better for you, IMO.
This has nothing to do with running out of heap. It is about running native memory: every thread uses some native memory (mainly for stack trace). Even if there is plenty of heap, the native memory area might be too small to create a new thread. So the JVM should run stably as long as no other threads are created.
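A rough way to see why thread count matters independently of heap size is to multiply the number of threads by the per-thread stack size. The sketch below does that arithmetic and also shows the standard `Thread` constructor that takes a stack size hint. `ThreadStackBudget` and its method names are hypothetical, and the figures used with it are illustrative only, not measurements from this report.

```java
/** Hypothetical helper for estimating native memory reserved for thread stacks. */
class ThreadStackBudget {

    /**
     * Rough upper bound, in bytes, on the memory reserved for thread stacks.
     * Real usage depends on the OS and JVM; this is only threads * stack size.
     */
    static long stackBytes(int threads, long stackSizeBytes) {
        return Math.multiplyExact((long) threads, stackSizeBytes);
    }

    /**
     * Starts a thread with an explicit stack size. The value is only a hint;
     * the Thread documentation allows the JVM to adjust or ignore it.
     */
    static Thread startWithStack(Runnable task, long stackSizeBytes) {
        Thread t = new Thread(null, task, "small-stack-worker", stackSizeBytes);
        t.start();
        return t;
    }
}
```

For example, `stackBytes(1000, 512L * 1024)` is 524288000 bytes, i.e. 500 MiB of stack space that does not come out of the Java heap. The JVM-wide default stack size can be lowered with the `-Xss` option.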
Oops, I meant "it is about running out of native memory", and native memory is mainly used for thread stacks (not traces). As for the proposed fix: on the other hand, the ThreadPool's contract might be not to throw an exception when it fails to create a new thread, but to tolerate this and queue the runIt request for an existing thread to execute.
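The alternative contract could look roughly like the sketch below. It is not Tomcat's ThreadPool: `TolerantPool` and `pollQueued` are hypothetical names, and the thread factory is injected only so that a creation failure can be simulated. The one assumption about real behaviour is that the "unable to create new native thread" error surfaces when the thread is started.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ThreadFactory;

/** Hypothetical pool illustrating a "tolerate and queue" contract. */
class TolerantPool {
    private final ThreadFactory factory;
    private final Queue<Runnable> backlog = new ArrayDeque<>();

    TolerantPool(ThreadFactory factory) {
        this.factory = factory;
    }

    /**
     * Tries to run the task on a new thread. If the thread cannot be
     * created, the task is queued instead of the error reaching the caller.
     * Returns true if a new thread was started, false if the task was queued.
     */
    synchronized boolean runIt(Runnable task) {
        try {
            factory.newThread(task).start();
            return true;
        } catch (OutOfMemoryError e) {
            backlog.add(task);
            return false;
        }
    }

    /** Existing workers would call this when idle; returns null if nothing is queued. */
    synchronized Runnable pollQueued() {
        return backlog.poll();
    }
}
```

With this contract the calling worker never sees the error, so it cannot die from it. The trade-off is that queued accept requests only make progress when an existing thread becomes free.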
No, I disagree with your proposed fix. The ability to work when your configuration is bad enough that you can't create a new thread is not something I want to even try to do. Make sure your configuration (of the heap, of the stack size, and of the thread pool properties) works along with your hardware and operating system.
*** Bug 32262 has been marked as a duplicate of this bug. ***
This is obviously a problem, at the very least in that the problem is not logged! eBay gets OOMs in the application stack all the time, and they are recoverable. This behavior is unacceptable for an enterprise server.
We had the same issue (JDK 1.5, Tomcat 5.0.25), and it was solved by giving the VM more PermGen space.