|
I can confirm this patch fixed the memory problem, in general. Thanks!
However, if not nitpicking, I did noticed in this patch, during the transition time between the previous index searcher/reader close and the new index searcher/reader open, there is a bump in the memory graph. In previous Lucene version without Lucene-1195, the memory graph looks more clean, with a V dip. And in one index refresh during the same index run, there were 2 RAMDirectory in the memory for a fairly long time. But not repeatable. So I suspect the closing is kind of delayed. And sometimes it could be missed. Attaching the screenshot for the memory bump, and 2 the memory graphs of the begin and after of 2RAMDirecory in the memory(needed 2 because the time is fairly long) The test is repeatedly close and open the index.
After the repeat close/open test run finished, I closed the RAMDirectory based searcher/reader, and switched to FSDirectory, the RAMDirectory(which is fairly large) was left in the memory. I did several rounds of GC, but the RAMDirectory was not cleaned. At this time, I tried to do a memory dump, but an OOM occured. Hmm, the sometimes long-lasting bump in Used memory is odd.
When you close the old IndexSearcher & RAMDirectory do you forcefully dereference them (set all variables that point to them to null) before then reopening the new ones? (Seems like you must, since you saw the V shape before I think the bump may just be GC taking its time collecting the objects, which should be harmless. Maybe using a WeakReference to a long-lived object causes GC to take longer to collect? Is it possible to use the profiler to look for a reference to the old RAMDirectory while the bump is "up"? That would be a smoking gun that we do still have a reference, unless it's only via the WeakReference. Or, could you try lowering the heap size in your JRE to something slightly less than 2X one RAMDirectory (but not so low that the "other stuff" you have in the JRE can't fit). This would force GC to work harder / more immediately in reclaiming the unreachable objects. On second thought, I think this is normal behavior. Like you said, the GC may take a while before really cleaning up stuff. So if I did not wait a while between the close and open, the little bump should be normal.
So I would say this bug is fixed. Thanks! OK super! Thanks for testing Chris. I'll commit shortly.
Committed revision 695184.
Even if the issue is closed, for those who want to know why ThreadLocal had to be fixed : http://www.javaspecialists.eu/archive/Issue164.html
"Best Practices If you must use ThreadLocal, make sure that you remove the value as soon as you are done with it and preferably before you return your thread to a thread pool. Best practice is to use remove() rather than set(null), since that would cause the WeakReference to be removed immediately, together with the value. Kind regards Heinz" It doesn't need to be "fixed".
It works fine, you need to understand how to use it properly which may require calling remove() if your threads are long-lived. Also, remove() was only added in 1.5, and that is why that technique is not valid for Lucene. It is also not valid, since you need to clear all values across all threads, remove() only clears the entry for the calling thread. The current patch solves the problem suitably for Lucene, at the expense of some performance degradation.
The performance degradation should be negligible in the case of long-lived threads. No synchronized code was added in the get() path.
This best practice does work correctly: you're supposed to call remove() from the very thread that had inserted something into the ThreadLocal. Besides the 1.5 issue, this is also difficult for Lucene because we don't have a clean point to "know" when the thread has finished interacting with Lucene. A thread comes out of the pool, runs a search, gathers docIDs from in a collector, returns from the search, one by one looks up stored docs / term vectors for these docIDs, maybe does secondary search up front to build up filters, etc., finishes rendering the result and returns to the pool. Unless we create a new method "detachThread(...)" somewhere in Lucene, there is no natural point now to do the remove(). And I don't like creating such a method because nobody will ever know they need to call it in their App server until they start hitting cryptic OOMs. Ok, maybe "fixed" was too much and it sounded like ThreadLocal has a problem. ThreadLocal works fine except that it is tricky with long-lived threads.
Anyway, it is a good article in case somebody wants to understand why this change in Lucene. Michael,
Maybe some sort of cleanup method should be created even if regular users will not call it. I presume it can be highlighted somewhere in the documentation, like in "Best practices" : http://wiki.apache.org/lucene-java/BestPractices I would really like to be able to control this, to be able to clean thread(s) storage programmatically. You cannot control this 'externally", since there is no way to force the cleaning of the stale entries (no API method).
The current patch sort of allows this - the entries are not cleared but the weak references in them are - via the close(), but if this method is called at the wrong time, everything will break (since the stream needs to be cached), thus it is not exposed to the outside world. If you want to cleanup programatically, close the index readers and writers. If closing all index readers and writers releases all Lucene thread locals it's great.
Thanks. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The patch adds o.a.l.util.CloseableThreadLocal. It's a wrapper around ThreadLocal that wraps the values inside a WeakReference, but then also holds a strong reference to the value (to ensure GC doesn't reclaim it) until you call the close method. On calling close, GC is then free to reclaim all values you had stored, regardless of how long it takes ThreadLocal's implementation to actually release its references.
There are a couple places in Lucene where I left the current usage of ThreadLocal.
First, Analyzer.java uses ThreadLocal to hold reusable token streams. There is no "close" called for Analyzer, so unless we are willing to add a finalizer to call CloseableThreadLocal.close() I think we can leave it.
Second, some of the contrib/benchmark tasks use ThreadLocal to store per-thread DateFormat which should use tiny memory.