Bug 36594 - Hang (infinite loop) in ResourceCache under high load
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 5
Classification: Unclassified
Component: Catalina
Version: 5.5.9
Hardware: All All
Importance: P2 major
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
Reported: 2005-09-10 21:34 UTC by Anil Gangolli
Modified: 2005-09-12 10:26 UTC

Attachments
Patched version of ResourceCache.java (full text) (11.45 KB, patch)
2005-09-10 21:37 UTC, Anil Gangolli

Description Anil Gangolli 2005-09-10 21:34:48 UTC
Under sustained high loads (100-200 requests per sec) over 12+ hour periods we
were regularly hitting a situation where all of the serving threads would be
blocked except for one thread looping infinitely in
org.apache.naming.resources.ResourceCache#allocate().

After quite a bit of effort, we tracked this down to three minor bugs in the
ResourceCache code: two in the cacheSize accounting, and one off-by-one error
in the range used when randomly generating the index of an entry to consider
for freeing.
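
For readers following along, here is a minimal sketch of how these two kinds
of bug can combine into a loop that never exits. It is not the attached patch
and not the real ResourceCache: the class EvictionSketch, its Entry type, and
the put(), usedSize() and actualSize() methods are hypothetical, and only the
name allocate() is borrowed from the report. The sketch assumes a size counter
that must match the stored entries and a random victim index that must cover
every entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified cache; NOT the real org.apache.naming ResourceCache.
class EvictionSketch {
    static final class Entry {
        final String name;
        final int size;
        Entry(String name, int size) { this.name = name; this.size = size; }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final Random random = new Random();
    private final int maxSize;
    private int usedSize; // invariant: equals the sum of the sizes in 'entries'

    EvictionSketch(int maxSize) { this.maxSize = maxSize; }

    /** Evicts random entries until 'space' fits; returns false if it never can. */
    synchronized boolean allocate(int space) {
        if (space > maxSize) {
            return false; // can never fit, so do not start evicting
        }
        while (usedSize + space > maxSize) {
            if (entries.isEmpty()) {
                // The counter has drifted from the real contents.
                // Repair it and stop, rather than spinning forever.
                usedSize = 0;
                break;
            }
            // nextInt(n) returns a value in [0, n), so every index is a candidate.
            // A bound of n - 1 would never pick the last entry.
            int victim = random.nextInt(entries.size());
            Entry removed = entries.remove(victim);
            usedSize -= removed.size; // adjust the counter exactly once per removal
        }
        return true;
    }

    /** Adds an entry; the counter changes only when the entry is really stored. */
    synchronized boolean put(String name, int size) {
        for (Entry e : entries) {
            if (e.name.equals(name)) {
                return false; // already cached: must not be counted twice
            }
        }
        if (!allocate(size)) {
            return false;
        }
        entries.add(new Entry(name, size));
        usedSize += size;
        return true;
    }

    synchronized int usedSize() { return usedSize; }

    /** Recomputes the size from the entries, to cross-check the counter. */
    synchronized int actualSize() {
        int sum = 0;
        for (Entry e : entries) {
            sum += e.size;
        }
        return sum;
    }

    public static void main(String[] args) {
        EvictionSketch cache = new EvictionSketch(100);
        for (int i = 0; i < 1000; i++) {
            cache.put("resource-" + (i % 50), 1 + (i % 30));
        }
        System.out.println("counter matches contents: "
                + (cache.usedSize() == cache.actualSize()));
    }
}
```

If the counter stays above the real total, or some entry can never be picked
as a victim, a loop of this shape can keep running while every other thread
waits on the same lock.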

We've created a local patch, and we've verified that this fixes the issue for
us, actually monitoring the correctness of the cacheSize accounting.  I will be
attaching a patched source for ResourceCache.java.  Please consider
incorporating these fixes into subsequent revisions.  The bug appears both in
Tomcat 5.0.30 and the latest trunk code.

Anil Gangolli (anil@busybuddha.org / gangolli@apache.org)
Comment 1 Anil Gangolli 2005-09-10 21:37:28 UTC
Created attachment 16353
Patched version of ResourceCache.java (full text)

This file contains the full source including the patches.  Each change is
marked with a comment describing it, and the changes should be evident from a
diff against the original.  If you need anything else, or further explanation,
please let me know.
Comment 2 Remy Maucherat 2005-09-11 18:49:33 UTC
This seems like a good patch, but you should submit diffs rather than full files.
Comment 3 Anil Gangolli 2005-09-11 21:46:47 UTC
Thanks.  I'll submit diffs as well tomorrow.  I'm not near the code presently.
Comment 4 Remy Maucherat 2005-09-11 21:53:44 UTC
(In reply to comment #3)
> Thanks.  I'll submit diffs as well tomorrow.  I'm not near the code presently.

That should be ok, it's just that diffs are much easier to work with.
Comment 5 Remy Maucherat 2005-09-12 12:56:51 UTC
I applied the patch. Thanks a lot since it would have been impossible to debug
without knowing the exact usage.

The case where insertCache "fails" because the entry is already present is a
race condition between allocate and insert, so it would be best to perform
another lookup under sync before trying (to avoid the unneeded allocate):

Index: ProxyDirContext.java
===================================================================
RCS file: /home/cvs/jakarta-tomcat-catalina/catalina/src/share/org/apache/naming/resources/ProxyDirContext.java,v
retrieving revision 1.18
diff -u -r1.18 ProxyDirContext.java
--- ProxyDirContext.java	20 Jul 2005 21:25:18 -0000	1.18
+++ ProxyDirContext.java	12 Sep 2005 10:43:35 -0000
@@ -1596,7 +1596,7 @@
         // Add new entry to cache
         synchronized (cache) {
             // Check cache size, and remove elements if too big
-            if (cache.allocate(entry.size)) {
+            if ((cache.lookup(name) == null) && cache.allocate(entry.size)) {
                 cache.load(entry);
             }
         }
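
The same lookup-before-allocate pattern can be tried in isolation with the
minimal sketch below. It is an illustration only, not the ProxyDirContext
code: the class LookupBeforeInsertSketch, the cacheIfAbsent() method, and the
allocations counter are hypothetical names, and a plain HashMap stands in for
the resource cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the cache; NOT the real ProxyDirContext logic.
class LookupBeforeInsertSketch {
    private final Map<String, byte[]> cache = new HashMap<>();
    private int allocations; // counts how often space was actually allocated

    /** Looks the name up again under the lock, and allocates only if absent. */
    boolean cacheIfAbsent(String name, int size) {
        synchronized (cache) {
            if (cache.get(name) != null) {
                return false; // another thread inserted it first: skip the allocate
            }
            allocations++;
            cache.put(name, new byte[size]);
            return true;
        }
    }

    int allocations() {
        synchronized (cache) {
            return allocations;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LookupBeforeInsertSketch sketch = new LookupBeforeInsertSketch();
        // Two threads race to cache the same resource.
        Runnable task = () -> sketch.cacheIfAbsent("/index.html", 1024);
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("allocations = " + sketch.allocations()); // prints "allocations = 1"
    }
}
```

Because the lookup and the insert happen under the same lock, whichever thread
arrives second sees the entry and returns without allocating.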
Comment 6 Anil Gangolli 2005-09-12 18:26:36 UTC
Thanks for the rapid response.  Will this fix be present in any future Tomcat 
5.0.x releases as well?  Any chance of getting this?