Created attachment 29775 [details] Stacktrace This only happends sometimes, and is hard to reproduce. After undeploying and deploying a webapp, sometimes the webapp show an exception at the jps's. I've traced it down to ManagerBase.java - generateSessionId() method. It looks like the sessionIdGenerator is null. Attached stacktrace
Hmm. An NPE there means that a session manager is being used that either has not been started or has been stopped. Given it is hard to reproduce we will probably have to resort to code analysis to determine what is goin on. I have a few ideas but before I start looking harder it would be useful to have a few more details. 1. Have you ever seen this issue after a normal start or is it only on a restart? 2. When the NPE appears, does it happen on every request from then on or does it happen to just a few requests during the restart? 3. If the answer to 2) is every request does a context restart fix it?
Hi Mark (and others), 1. I have never seen this after startup of tomcat, only when undeploying/deploying a webapp. 2. Yes, it happends on all requests. 3. No, a context restart does not fix it, only a restart of the tomcat fixes it.
I'm still not exactly sure what is going on here but it looks like something is going wrong with the undeploy. There have been some fixes related to failed undeploys since 7.0.29 so it would be helpful if you could test with 7.0.34. In the meantime, I'll keep digging.
I've found and fixed one place where undeploy could go wrong (filters throwing exceptions on destroy) and I have fixed that in trunk and 7.0.x for 7.0.35. I'm not sure that is what is going on here though although it is probably something very like that. I'd expect to see something in the logs during the undeploy to indicate what has gone wrong.
We need to track what is going on. Please add the following to you logging.properties and then provide the section of the logs that shows the undeploy/deploy when the error occurs. org.apache.catalina.util.LifecycleBase.level = FINE Note: This will slow down Tomcat start and stop but it should have minimal impact on normal running.
It has been three weeks without any further information provided. If the information has not been provided by the time of the next 7.0.x release then this issue will be closed as WONTFIX.
(In reply to Mark Thomas from comment #6) > It has been three weeks without any further information provided. If the > information has not been provided by the time of the next 7.0.x release then > this issue will be closed as WONTFIX. Hi, I have the same error and I get it quite often. I use Tomcat 7.0.50 on a 64 bit machine. I have this log4j properties config, if you tell me how to change it to log the error I can provide the logs. # configure the daily rolling file appender log4j.rootLogger=FINE, RollingAppender # Set the priority for each package log4j.logger.org.apache.catalina.util.LifecycleBase=FINE, RollingAppender log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender log4j.appender.RollingAppender.File=../logs/ApplicationWeb.log log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n The staccktrace of the error I get is: java.lang.NullPointerException org.apache.catalina.session.ManagerBase.generateSessionId(ManagerBase.java:807) org.apache.catalina.session.ManagerBase.createSession(ManagerBase.java:653) org.apache.catalina.connector.Request.doGetSession(Request.java:2956) org.apache.catalina.connector.Request.getSession(Request.java:2320) org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:899) org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:911) com.abovo.gc.plata.filter.AuthenticationFilter.doFilter(AuthenticationFilter.java:47)
It is extremely rude to add someone to a cc list for a bug without their permission. Repeat offenders will be blocked from further use of the ASF Bugzilla system.
This is either an application bug or a bug in com.abovo.gc.plata.filter.AuthenticationFilter that is retaining and using a reference to a request object from a previous request rather than using the request object for the current request. This is not a bug in Apache Tomcat. Further assistance is available from the Apache Tomcat users mailing list.
Sorry, I tought that you could be interested in the opportunity to have some debug about the presumed error. I will keep investigate it.
Lorenzo's thread on users@ for Tomcat 7.0.50: http://marc.info/?t=139351186000004&r=1&w=2 Just some guesses 1. This should not happen, because StandardWrapperValve.invoke() takes care of it. If web application is being stopped while some requests are still being processed, Tomcat logs the following message is logged like the following: [[[ 05-Mar-2014 19:59:33.701 INFO [localhost-startStop-1] org.apache.catalina.core.StandardWrapper.unload Waiting for 1 instance(s) to be deallocated for Servlet [jsp] 05-Mar-2014 19:59:34.701 INFO [localhost-startStop-1] org.apache.catalina.core.StandardWrapper.unload Waiting for 1 instance(s) to be deallocated for Servlet [jsp] ]]] If such errors are logged, it is explainable that other errors may follow. Note though that the messages are logged at INFO level. Sometimes logs are configured to skip those. Note that 1) StandardContext.stopInternal() stops its wrappers (=children) before all other subcomponents 2) The issue is reported as occurring in a filter. Invoking the filter chain is performed in StandardWrapperValve.invoke() in between wrapper.allocate() and wrapper.deallocate() calls. As such, it should be protected by the same allocation counter that protects the servlet that is being called here. 2. In StandardWrapper.allocate() I think the following code: [[[ if (unloading) throw new ServletException (sm.getString("standardWrapper.unloading", getName())); ]]] would better be copied inside synchronized (this) {} block below. The 'unloading' field may change while we are waiting to get into synchronized block. That is if there is the following sequence of events: TH1: in StandardWrapper.allocate() checks for unloading TH1: in StandardWrapper.allocate() waits for synchronized block TH2: executes StandardWrapper.unload() TH1: in StandardWrapper.allocate() obtains the monitor and allocates a servlet Expected: TH1 should not be able to allocate one, as unloading has already happened. 3. I think the o.a.c.core.StandardContext.paused field would better be marked volatile. The redeployment happens in background thread, but the field is read in CoyoteAdapter.postParseRequest() in a request processing thread. It might read a stale value. (It would not cause this NullPointerException as reported here. It can cause "503 Unavailable" response from StandardWrapperValve.invoke() or ServletException("standardWrapper.unloading") from StandardWrapper.allocate()). 4. In StandardWrapper.invoke() the check for "if (!context.getState().isAvailable())" is done before "wrapper.allocate()" call. There is small time window between those checks. (It would not cause this NullPointerException as reported here. It can cause ServletException("standardWrapper.unloading") from StandardWrapper.allocate()). Anyway, in this case I do not see how it could be done better. The LifecycleBase.state field is already marked as 'volatile'. Asking for it twice seems like a waste. The checks in allocate() should protect here/ ======== Resume: for 1.: Beware if logging is configured to skip INFO messages. Of course, if context is stopped before request processing on it completes, some things are expected to fail. Tomcat waits here, the timeout is configurable. It is a tradeoff. An infinite wait is usually undesirable. See "unloadDelay" setting on Context. The default value is 2 seconds. http://tomcat.apache.org/tomcat-7.0-doc/config/context.html for 2. and 3.: These can be improved. for 4.: OK, nothing to do.
re 2, I don't see how this could lead to the constant NPE issue the OP reported unless deployment failed. Also re 2, unloading would benefit from being volatile. Adding a check for unloading in loadServlet() should address the issue of allocation occurring after unloading has started. 1, 3 & 4. Agreed. I have applied the robustness improvements to 8.0.x for 8.0.4 and 7.0.x for 7.0.53. I still haven't see any evidence that there is a Tomcat bug here such as an explanation of how any of the reported problems can occur that doesn't involve an application and/or library retaining a reference it shouldn't. Therefore, I am returning this bug to the invalid state.
Created attachment 31375 [details] the logs when I get the null pointer
Created attachment 31376 [details] the logs when the server is redeployed successfully
I added some logs with at the INFO level. I hope you can gather some information from it.
(In reply to Mark Thomas from comment #13) > re 2, I don't see how this could lead to the constant NPE issue the OP > reported unless deployment failed. > I said: > TH1: in StandardWrapper.allocate() checks for unloading > TH1: in StandardWrapper.allocate() waits for synchronized block > TH2: executes StandardWrapper.unload() > TH1: in StandardWrapper.allocate() obtains the monitor and allocates a servlet > Expected: TH1 should not be able to allocate one, as unloading has > already happened. I mean that it continues this way: TH2 continues shutting down the context, as it thinks that all wrappers are unloaded and no request processing is happening at this moment. In StandardContext.stopInternal() TH2 has done "children[i].stop();" and the next steps that it does are: * filterStop(); * ((Lifecycle) manager).stop(); In ManagerBase.stopInternal() it does "this.sessionIdGenerator = null;". TH1: It goes on with request processing and encounters that the session manager has been stopped and manager.sessionIdGenerator is null. This is just a theory, I have not tested it.
(In reply to Konstantin Kolinko from comment #17) > (In reply to Mark Thomas from comment #13) > > re 2, I don't see how this could lead to the constant NPE issue the OP > > reported unless deployment failed. > I mean that it continues this way: > > TH2 continues shutting down the context, as it thinks that all wrappers are > unloaded and no request processing is happening at this moment. > > In StandardContext.stopInternal() TH2 has done "children[i].stop();" and the > next steps that it does are: > > * filterStop(); > * ((Lifecycle) manager).stop(); > > In ManagerBase.stopInternal() it does "this.sessionIdGenerator = null;". > > > TH1: It goes on with request processing and encounters that the session > manager has been stopped and manager.sessionIdGenerator is null. > > This is just a theory, I have not tested it. That would explain a small number of NPEs. I don't see how it could result in all future requests resulting in NPEs as reported by the OP.