Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Duplicate
-
2.4, 2.4.1, 2.9, 2.9.1, 3.0
-
None
-
None
-
New, Patch Available
Description
When using Tomcat 6 and Solr 1.3 (with Lucene 2.4), we found that if we caused Tomcat to reload our .war files a number of times, we would eventually see PermGen memory errors, where the JVM's GC reported that all "permanent generation" memory had been consumed and none could be freed. This turns out to be a fairly common issue when using Tomcat's autoDeploy feature (or similar features of other application servers). See, for example:
http://ampedandwired.dreamhosters.com/2008/05/09/causes-of-java-permgen-memory-leaks/
http://cornelcreanga.com/2009/02/how-to-prevent-memory-leaks-when-reloading-web-applications/
http://www.samaxes.com/2007/10/classloader-leaks-and-permgen-space/
http://blogs.sun.com/fkieviet/entry/how_to_fix_the_dreaded
My understanding of the issue is that when reloading a webapp, Tomcat starts by releasing all of its references to the ClassLoader used to load the previous version of the application. It then creates a new ClassLoader, which reloads the application. The old ClassLoader and the old version of the app are left for the garbage collector to clean up. However, if the app itself holds references to the ClassLoader, the GC may not be able to determine that the ClassLoader is truly unused, in which case it and the entire old version of the app remain in memory. After a sufficient number of app reloads, PermGen space is eventually exhausted.
The particular issue we had with Solr and Lucene was that Lucene's TimeLimitedCollector creates a thread which is not shut down anywhere; this in turn seems to prevent Tomcat from unloading the ClassLoader. To solve this, I applied a minor patch to TimeLimitedCollector that adds a flag variable controlling the timer thread's loop, plus some methods to set the flag so that the thread will exit.
The stopThread() method can then be called by an application such as Solr from a class registered as a servlet context listener; when the server unloads the application, the listener executes and in turn stops the timer thread. My testing over multiple reloads of solr.war, with and without these patches, indicates that without them we consistently get PermGen errors. With them, once the PermGen is nearly exhausted (which may take a lot of reloads, e.g., 10-15!), the GC is able to free space and no PermGen errors occur.
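As a rough illustration of the approach described above, here is a minimal sketch of a timer thread whose loop is controlled by a flag, together with a shutdown hook that stops it. This is not the attached patch and not Lucene's actual code: the class names StoppableTimerThread and TimerShutdownHook, the resolution parameter, and the one-second join timeout are all assumptions made for the example. Only the stopThread() name comes from the description. To keep the sketch self-contained, the hook is a plain class; in a real webapp it would implement javax.servlet.ServletContextListener and be registered in web.xml.

```java
/** Hypothetical stand-in for the patched timer thread; not Lucene's actual code. */
final class StoppableTimerThread extends Thread {
    private final long resolutionMillis;
    // Written only by this thread, read by others, hence volatile.
    private volatile long elapsedMillis = 0;
    // The flag that controls the loop; set from another thread to request exit.
    private volatile boolean stopRequested = false;

    StoppableTimerThread(long resolutionMillis) {
        super("StoppableTimerThread");
        this.resolutionMillis = resolutionMillis;
        setDaemon(true);
    }

    @Override
    public void run() {
        while (!stopRequested) {
            elapsedMillis += resolutionMillis;
            try {
                Thread.sleep(resolutionMillis);
            } catch (InterruptedException e) {
                // In this sketch an interrupt is treated as a stop request.
                return;
            }
        }
    }

    long getMilliseconds() {
        return elapsedMillis;
    }

    /** Asks the loop to exit and wakes the thread if it is sleeping. */
    void stopThread() {
        stopRequested = true;
        interrupt();
    }
}

/**
 * Hypothetical shutdown hook. In a servlet container this logic would live in
 * ServletContextListener.contextDestroyed(ServletContextEvent).
 */
final class TimerShutdownHook {
    private final StoppableTimerThread timer;

    TimerShutdownHook(StoppableTimerThread timer) {
        this.timer = timer;
    }

    void contextDestroyed() throws InterruptedException {
        timer.stopThread();
        // Wait briefly so the thread is gone before the webapp is unloaded.
        timer.join(1000);
    }
}
```

Once the thread has exited, it no longer keeps the old webapp's classes reachable, so the old ClassLoader becomes eligible for collection, assuming nothing else still references it.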
Attachments
Issue Links
- is duplicated by
  LUCENE-2822 TimeLimitingCollector starts thread in static {} with no way to stop them (Closed)
- relates to
  LUCENE-997 Add search timeout support to Lucene (Resolved)
  SOLR-1735 shut down TimeLimitedCollection timer thread on application unload (Closed)