|
[
Permlink
| « Hide
]
Mark Miller added a comment - 15/Nov/08 09:33 PM
Test exposing the problem and one possible fix for it.
So this is a bit of an ugly way to do this. Whats the best way? Some kind of clone support on Directory?
Hmm good catch Mark! With
I'll make those changes & commit. New patch attached. If this looks OK to you Mark, I'll commit. Thanks!
Looks great to me. The only nitpick thing I'd change (its a left over from my first patch) is dropping mabyeCopy in SegmentReader into the else clause. Don't think its necessary, but I want to OCD it down.
Hmm actually it's not so simple, because we also need the directory to be cloned if you reopen the segment or get a whole new SegmentReader. Also, the case that
Ahh, I see. The newReader variable name should have tipped me off <g> I had it in my mind that there could be more but obviously didn't retain the idea (that test actually only breaks on one of the two originals).
Goes deeper than I would have guessed (not that I know reopen at all). Still have a test failing...
Attached another iteration...
I moved the logic into DirectoryIndexReader's reopen method, and fixed the case where an exception his hit during reopen (eg if a writer is committing, that's expected) to not incorrectly close the directory. I also strengthened the test cases to verify we are not over-incRef'ing the FSDir. Hmm...I've still got a test that fails. I'll post it tonight or in the morning after a triple check.
...trying to check my shoot from the hip modus operandi for a moment <g>... edit Ah, was optimizing without changing so getting back the same indexreader. Bad test. Patch holding up good so far... Okay, I've banged on it and I can't break it. Thats a brilliant way to do it by the way - not nearly as ugly (and is much shorter and actually works, so thats nice).
Patch looks good... I guess this was my bug, so thanks for finding and fixing it Mark and Mike!
OK thanks for the review guys! I'll commit soon.
Committed revision 718540. Thanks Mark!
I think this bug should be called out in changes. Disagree? I've seen 2 to 3 people that have seen it, and it seems we should document that its fixed in this version. I'll add it before the next release if nobody disagrees.
I agree – I just committed an entry to CHANGES.txt.
Reopening for backport to 2.4.1.
Committed revision 745797 on 2.4 branch.
Is this patch obsolete, if
It was too late in the night yesterday: I think this is not really related to
As this issue is only interesting for IndexReaders opened without "Directory" but witch FS path, it was a problem with incomplete removal of the reference counting of FSDirs in an early test. If the problem occurs again, I will reopen this issue. After some testing, I found out, that this issue is not fixed. The test "TestIndexReaderReopen" fails very often, if all occurences of FSDirectory.getDirectory() are replaced by FSDirectory.open() in IndexReader/IndexWriter and DirectoryReader.
open() returns non refcounting single-use directorys that can be closed only one time. If readers on top of this (using the now-deprecated File/String IndexReader.open()) are reopened the directories are sometimes closed false. I was hoping, There are two possibilities to fix this:
Attached is the patch, that replaces all occurences of FSDir.openDirectory() by open(). The test "TestIndexReaderReopen" fails almost every time with an AlreadyClosedException.
Vote for "leave them open". Yes, it breaches the contract, but the breach is controlled (and thus harmless) and we get rid of some weird code (=possible point of failure) without introducing new. IndexReader r = IndexReader.open("/path/to/index"); ..... Directory d = r.directory(); // you have to get directory reference as you're not the one who created it ..... r.close(); ..... d.doSomething(); // and EXPECT this call to fail with exception Since we've deprecated all methods that are using FSDirectory.getDirectory under-the-hood, why do we even need to fix this? Ie why replace all these with the new FSDir.open, now, when we're just going to remove them in 3.0 anyway?
Because the error sometimes also occurs with the refcounting directories, but more seldom (because of the refcounting helps to keep the directory open, even when it is closed one time too much).
And our problem: we want to really remove this ugly closeDir stuff from IndexReaders, the code is sometimes unreadable and its hard to find out whats going on. But the refcounting is also deprecated? And, IndexReader will no longer track closeDir in 3.0, since that's only set to true in the deprecated methods?
Oh, you mean there is an intermittent failure on the current trunk? (Ie, when using FSDir.getDirectory under the hood). This is the solution using the FilterIndexReader, all tests now pass (with refcounting deprectated dirs as well as FSDir.open-dirs, see next Patch).
The solution consists of two parts:
I would like to have this in 2.9, to get rid of these ugly closeDirectory hacks! All tests pass (I retried TestIndexReaderReopen about hundred times and no variant fails anymore). It also works, when replacing the refcounting FSDir.getDirectory by FSDir.open() calls (see next patch). This is a variant for testing the same with FSDir.open(). As you see, the reopening now also works correctly here and the underlying directory is not closed too often.
This patch is for demonstration only. I forgot to mention:
Yes, for example, if you clone() and index that owns a directory. In this case the directory was not cloned using the ugly hack currently in trunk (see patches before, which is also removed again by my patch) and was then not correctly refcounted. During reopen(), sometimes DirectoryReader uses clone() to create a new instance, when the readonly flag changed, but the index was not modified. The testcase leads sometimes to exactly this behaviour and fails then. My patch now also handles clone correctly, so it fixes all these problems by just factoring out the deprecated Directory management to a FilterIndexReader. DirectoryReader just sees the directory and will never try to close it. Patch looks fine. I read the last one,
Both patches are the same, the second one is just using FSDir.open() instead of FSDir.getDirectory. For committing, I would use getDirectory(), the first patch. FSDir.open() would change behaviour for uses relying on the System property to specify the directory impl.
OK patch looks good Uwe! This is much cleaner, and it's great to be able to get the scary closeDirectory logic entirely out of DirectoryReader.
Only two discussion points (the first one came up during an IRC chat with Earwin):
I think it should (be closed in a finally clause).
That sounds good.
Then there's the next question of the same sort, but probably belonging in a separate issue. If we close a DR and one of SR throws an exception - should we close the others (currently we don't)? What is the right way, in general, of handling IOExceptions on IR close? Can we retry the close? What does this exception mean? Good question
In some places we try hard to keep closing stuff, and then throw the In other places (SR, DR) we don't. In general you can retry a close – calling close more than once on For writers (IndexWriter, and DR with pending changes) we try hard not Here is an updated patch:
It is ready to commit. Mike will you do this, or should I assign myself to this issue? Do we need to backport to 2.4.2? It's not so easy, because DirectoryReader does not yet exist there (so you cannot simply apply the patch)...
Without backporting, clone() would not work correctly with shared FSDir IndexReaders. And: Backporting would hurt performance, because 2.4 does not use per-segment search, so each call from the searcher is passed through the FilterIndexReader... Two suggestions:
Factor out RefCount class and use it everywhere through Lucene. I see at least one identical to yours in SegmentReader. Would be easier to replace all these uses with AtomicInteger later. Looking at the new unsightly loop in doClose(), what if we change all Lucene closeable classes to implement java.io.Closeable and create a static utility method(-s) that receives a bunch of Closeables (an array, iterable, vararg in 1.5) and tries to close them all? I can do the latter one in a separate patch to close this issue faster. Hi Earwin,
attached is a patch, that simply reuses SegmentReader.Ref. Factoring it out to o.a.l.util would be harder to do at the moment (some test-cases rely on this class). And SegmentReader seems to be the only class, that uses such a Ref construct. Other classes have their refCounter as Field. As the Filter is just a deprecated wrapper, that is removed in 3.0, I think reusing SegmentReader.Ref for that is ok. This patch also contains a new test for clone(), that does the same like the reopen test (checking the refcounts). Closeable is a Java 1.5 interface only, so this refactoring must wait until 3.0, but the idea is good!
Go ahead & assign to yourself!
Ok. Maybe you are right.
We can introduce our own Closeable, and replace it with java native in 3.0, thank gods the interface is simple Mike: OK, I commit the latest patch soon!
Earwin:
I think you should open an issue about that. But the own closeable should be declared as deprecated from the beginning with the note "will be replaced by java.io.Closeable" in 3.0. Committed revision 783280.
2.4 branch is untouched, if backporting is needed (because somebody has problems with reopen/clone), reopen this issue! I'm seeing a failure in back-compat tests ("and test-tag -Dtestcase=TestIndexReader"):
[junit] Testcase: testFalseDirectoryAlreadyClosed(org.apache.lucene.index.TestIndexReader): FAILED
[junit] did not hit expected exception
[junit] junit.framework.AssertionFailedError: did not hit expected exception
[junit] at org.apache.lucene.index.TestIndexReader.testFalseDirectoryAlreadyClosed(TestIndexReader.java:1514)
(I assume, but I'm not certain, it's from this fix...). Thanks Mike,
it is from this fix. The test should normally also fail with trunk, but it doesn't because we are using FSDir.open() in the trunk test. This is another test that relies on the refcounting of FSDir.getDirectory. The problem: This fixes this special case and the test on trunk to also hit this, I commit soon!
Committed revision 783314. Thanks Mike! Next time I will also test-tag, sorry.
I will go through all other tests using FSDir.open() in trunk now and check, if there are more cases, that rely on refcounting. They can be found easily, because they catch AlreadyClosedException. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||