Created attachment 31435 [details] the log from the crash
Tomcat 8 crashed when ~1000 users were browsing a website (JSP front-end, Struts 2 backend with Java Persistence API, Hibernate, c3p0, MySQL). No crashes are observed when using the Java NIO connector (not tested with the Java BIO connector). The attached file is the log from the crash.
Created attachment 31436 [details] another log - same issue
APR version is 1.5. OpenSSL 1.0.1f.
Created attachment 31437 [details] third one - same issue. Just to show that this is a recurring thing.
> Tomcat 8 crashed
Which exact version of Tomcat 8.0.x?
8.0.3 - currently the latest one, AFAIK.
Does high-load appear to be a required factor?
Apparently, yes. I wasn't able to reproduce it with a low number of concurrent users, but it's hard to say whether it would never crash or would just crash much later. I used JMeter to simulate high load. The whole server works flawlessly (including very low request-response latency) right up until it crashes. Nothing exceptional happens in the system: no unusual disk activity, no other CPU- or RAM-hungry processes. It just crashes suddenly for no obvious reason.
With 4000 users, it crashes in less than a minute (Used JMeter to test).
Could you provide a JMeter test case? I'm not a win32 dev, but I'm sure something that can reproduce the issue would be helpful to whoever looks at this.
Created attachment 31441 [details] JMeter test plan
I used a debugger to locate the crash address in the source code. It looks like the crash occurs inside the function "Java_org_apache_tomcat_jni_Poll_poll", specifically inside the following block:

    for (i = 0; i < num; i++) {
        tcn_socket_t *s = (tcn_socket_t *)fd->client_data;
        p->set[i*2+0] = (jlong)(fd->rtnevents);
        p->set[i*2+1] = P2J(s);
        if (remove) {
            apr_pollset_remove(p->pollset, fd);
            APR_RING_REMOVE(s->pe, link);
            APR_RING_INSERT_TAIL(&p->dead_ring, s->pe, tcn_pfde_t, link);
            s->pe = NULL;
            p->nelts--;
    #ifdef TCN_DO_STATISTICS
            p->sp_removed++;
    #endif
        }
        else {
            /* Update last active with the current time
             * after the poll call.
             */
            s->last_active = now;
        }
        fd ++;
    }

The crash address suggests that a read of the "s->pe" variable inside the "if(remove)" block fails (the variable is null).
What I meant is that s->pe is NULL, so the read through that pointer (a dereference of a NULL address) crashes the application.
A quick workaround would be to check s->pe for NULL before doing the read. However, if the library is designed so that there should never be a NULL at this point in the code, such a patch would only mask the underlying bug (a race condition, maybe?).
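To make the workaround idea concrete, here is a minimal, self-contained sketch of a NULL guard around the ring removal. It does not use the real tcnative or APR code: poll_entry, mock_socket, mock_pollset, and guarded_remove are hypothetical stand-ins that only imitate the shape of the block quoted above, and the hand-written ring helpers replace the APR_RING macros. It is a sketch of the guard under those assumptions, not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the poll entry linked into a ring. */
typedef struct poll_entry {
    struct poll_entry *next;
    struct poll_entry *prev;
} poll_entry;

/* Hypothetical stand-in for the socket; pe is NULL once the socket
 * has already been taken out of the pollset. */
typedef struct {
    poll_entry *pe;
} mock_socket;

/* Hypothetical stand-in for the pollset: two rings with sentinel heads. */
typedef struct {
    poll_entry poll_ring;
    poll_entry dead_ring;
    int nelts;
} mock_pollset;

static void ring_init(poll_entry *head)
{
    head->next = head;
    head->prev = head;
}

static void ring_insert_tail(poll_entry *head, poll_entry *e)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

static void ring_remove(poll_entry *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e;
    e->prev = e;
}

void mock_pollset_init(mock_pollset *p)
{
    ring_init(&p->poll_ring);
    ring_init(&p->dead_ring);
    p->nelts = 0;
}

void mock_pollset_add(mock_pollset *p, mock_socket *s, poll_entry *e)
{
    ring_insert_tail(&p->poll_ring, e);
    s->pe = e;
    p->nelts++;
}

/* Moves the socket's entry to the dead ring, as the quoted block does,
 * but skips the work when s->pe is already NULL instead of dereferencing it.
 * Returns 1 if the entry was removed, 0 if the guard skipped it. */
int guarded_remove(mock_pollset *p, mock_socket *s)
{
    assert(p != NULL && s != NULL);
    if (s->pe == NULL) {
        return 0; /* already removed elsewhere; nothing to unlink */
    }
    ring_remove(s->pe);
    ring_insert_tail(&p->dead_ring, s->pe);
    s->pe = NULL;
    p->nelts--;
    return 1;
}
```

Calling guarded_remove twice on the same socket removes the entry the first time and is a no-op the second time. As noted above, this only avoids the NULL dereference; it does not explain how s->pe became NULL in the first place, and it does not make the code safe if two threads touch the same socket concurrently.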
Same issue here. I tried the latest Tomcat 7 and 8 releases, but I get the same error and Tomcat crashes. Any updates on this? (Currently the only solution is to switch to NIO.)
*** This bug has been marked as a duplicate of bug 57653 ***
*** Bug 55797 has been marked as a duplicate of this bug. ***
*** Bug 56415 has been marked as a duplicate of this bug. ***
*** Bug 57140 has been marked as a duplicate of this bug. ***