|
Michael McCandless made changes - 19/Jan/07 03:09 PM
(This is a continuation of the discussion from one of the threads quoted in the previous comment, with some summations to provide context.)
Referring to a proposal to implement advisory locking where possible, and prevent index creation on volumes/systems where the delete mechanism fails, Michael McCandless wrote: But what I don't like about it is it doesn't "gracefully degrade" to The stratagem of publicly exposing IndexFileDeleter does not "degrade gracefully", either. If today's default deletions policy remains the default, then when an index is put on an NFS system, at some point search-time exceptions will occur after an update to an index. The user has to 1) detect this scenario without Lucene's help, probably from logs or user complaints 2) diagnose it without the aid of a meaningful error message, and 3) either implement their own IndexFileDeleter or choose an appropriate provided subclass, a task which will require that they grok both the innards of Lucene and the subtleties of NFS. The first step to graceful degradation is a meaningful error message. That means detecting a problem which normally requires both a indexing process and a search process to trigger, so we have to simulate it artificially with a test. 1) Open a file for read/write and write a few bytes to it. Our first opportunity to perform this test occurs at index-creation time. This is essentially cost free. A second opportunity arises whenever deletions are performed. Here, there's a small cost involved, and it may not be worth it, as this would only catch cases where an index was copied onto an NFS volume rather than created there, then subsequently modified. There's also another stratagem available to us: we can have IndexReaders establish advisory read locks against their relevant segments_N files on systems which support such locks. This won't help us to "degrade gracefully". However, it will help us degrade less often, as modern versions of NFS support file locking - but still do NOT support the "delete-on-last-close" mechanism Lucene depends on. Implementing advisory read locks is orthogonal to the addition of IndexFileDeleter, but diminishes the justification for adding it as a public API. But the good news is since we will allow subclassing to make your own The number of users this class will serve is diminishingly small. Other mitigation strategies are available. 1) If we implement advisory read locks, many people who see this error will no longer see it. For those who do, the best option is to upgrade the OS to a version which supports advisory locks over NFS. Then an index on an NFS volume will behave as any other. Flexibility is not free. There have been recent lamentations on java-dev about how difficult it will be to merge the write interfaces of IndexReader and IndexWriter to provide a single, unified class through which all index modifications can be performed. The exposure of the IndexFileDeleter mechanism contributes to this problem – it's one more small step in the wrong direction. Providing a subclassing/callback API is often an elegant strategy, and it is surely better in this case than it would be to provide a list of deletion policies for the user to select from. However, whenever possible, no API is always a better solution – especially in a case like this one, where the functionality provided has nothing to do with Lucene's core mission and is there solely to work around an implmentation-specific bug. There are two sections in the previous comment that are supposed to be quoted, but which are not because JIRA ate the email-style "greater than" quoting. They are:
"But what I don't like about it is it doesn't "gracefully degrade" to "But the good news is since we will allow subclassing to make your own I wish it were possible to edit that note, as it's really confusing as things stand. > * Second, change how IndexFileDeleter works: have it keep track of
> which commits are still live and which one is pending (as the > SegmentInfos in IndexWriter, not yet written to disk). > > Allow IndexFileDeleter to be subclassed to implement different > "deletion policies". > > The base IndexFileDeleter class will use ref counts to figure out > which individual index files are still referenced by one or more > "segments_N" commits or by the uncommitted "in-memory" > SegmentInfos. Then the policy is invoked on commit (and also on > init) and can choose which commits (if any) to now remove. > > Add constructors to IndexWriter allowing you to pass in your own > deleter. The default policy would still be "delete all past > commits as soon as a new commit is written" (this is how deleting > happens today). > > For NFS we can then try different policies as discussed on those > threads above (there were at least 4 proposals). They all have > different tradeoffs. I would open separate issues for these > policies after this issue is resolved. > This ties solving the NFS issue with an extendable-file-deletion policy. Also, IndexFileDeleter is doing delicate work - not sure you want Back to reference counting,- how about the following approach:
So, this is adding to Directory a general file utility, no knowledge of An NFS geared FileReferenceCounter would then be able to keep alive OK, a few top level summary comments and then some specifics below:
By implementing each approach (I think there are now 5 different
Lucene should try hard to be as portable as possible. NFS is used alot. It's tempting to tell users to "upgrade OS", "upgrade NFS server Now, if we had to bend over backwards for NFS, then I would agree Rather than baking any one of these approaches into the Lucene
Marvin I don't think your test will work either (see below). But I really like this direction: does anyone know how (Java Some specifics below: Marvin Humphrey wrote: > The first step to graceful degradation is a meaningful error message. I think this test won't work (though I haven't tested...). Typically > But the good news is since we will allow subclassing to make your own 5) isn't really a good option since we all can't even agree how to > Flexibility is not free. There have been recent lamentations on Yes there is an open question now on what to do about the confusion on > Providing a subclassing/callback API is often an elegant strategy, I disagree on this point ("no" API is better than subclassing). As By allowing a different delete policy as a subclass of I think subclassing is perfect for this sort of situation. It's like Mike Doron Cohen wrote: > This ties solving the NFS issue with an extendable-file-deletion policy. The solution I have in mind abstracts away all tricky details of public class OnlyLastCommitDeleter extends IndexFileDeleter { void onInit(List commits) { onCommit(commits); } void onCommit(List commits) { Ie, the sole responsibility of the IndexFileDeleter subclass (policy) > Back to reference counting,- how about the following approach: I think this approach could work, but, rather than implementing in the We have so much debate about the best "deletion policy" for NFS that Mike One clarification on "different deletion policies": to support "commit
on close" in IndexWriter, I already have to improve IndexFileDeleter to give it a different deletion policy than the current one. Specifically, the deleter must not delete anything referenced by the For example, if a writer is opened with autoCommit=false ("commit on To the deleter this would just be a different policy, one that keeps On Jan 19, 2007, at 2:04 PM, Michael McCandless (JIRA) wrote: When I asserted that IndexFileDeleter had nothing to do with Lucene's core > Yes there is an open question now on what to do about the confusion on using They're hard to refactor because they're both big, hence adding either code or > I disagree on this point ("no" API is better than subclassing). We're talking past each other. I was generalizing: a 100% successful, purely > I would not want to add file locking & additional complexity into the Lucene This is where our differing priorities manifest. I would rather add some Ironically, though you consider supporting NFS "part of Lucene's core I also think you may be over-estimating the amount of effort it will take to > I think subclassing is perfect for this sort of situation. I'm not so enamored of subclassing. It's less constraining than some other Case in point: it's not possible to provide a subclass of IndexFileDeleter In theory, your proposal even prevents the addition of such read locks to > It's like the various LockFactory implementations we have: there is no "one I don't think LockFactory ought to be exposed either. Reading from an index – any index, on any volume – should Just Work. In my view, any deviance from that ideal API due to implementation defects In keeping with this philosophy, Lock is not publicly exposed in KinoSearch.
Once advisory file locks are in place, and if they work as expected under With the switch to lockless commits in KinoSearch, I've able to refactor Lock With Otis signing on to your solution, it looks like momentum is gathering for I think you can do better. Quick summary first:
OK, as you said (and I agree) I think we just have a difference of Whereas I prefer to leave the Lucene core untouched since things work I think you also have a high confidence that the locking approach will So I would prefer instead to open a minimal API in the core of Lucene OK details below: > > * I think NFS support is part of Lucene's core mission. Well, "custom deletion policies" is in support of the core mission of > > Yes there is an open question now on what to do about the confusion on using OK, yes in the ideal case, no API is better than API if your situation > > I would not want to add file locking & additional complexity into the Lucene Yes, but at least having the option (picking one of the deletion > I also think you may be over-estimating the amount of effort it will take to Yes detection of NFS is orthogonal and I would love to find a solution > > I think subclassing is perfect for this sort of situation. No, I would not consider foreclosing that option a feature! Yes, I am nervous about relying on advisory read locks 100% today in Yes users who set their own deletion policies would not get this And, I don't want to change the default policy now ("first do no > > It's like the various LockFactory implementations we have: there is no "one I agree it would be great to reach this perfect world. It would be > I think you can do better. With time, I hope so too. Progress not perfection! Michael McCandless wrote:
> The solution I have in mind abstracts away all tricky details of I don't really understand this interface and so I cannot see how (I would prefer the DeletionPolicy to be a I found a flaw in my plan. If the read locks are always applied, they will
slow deletion of obsolete segments for everybody where delete-on-last-close currently works as intended. Right now, the files are unlinked and disappear as soon as the last reader holding them open goes away. With read locks, the unlink op wouldn't take place if a reader was open. I also spent a good bit of time yesterday further researching the subtleties So, new proposal: Add a new public method IndexReader.aquireReadLock(String hostId), which would The deletions policy would work as in my earlier proposal, and the user We might want aquireReadLock() to automatically zap any locks associated with > I don't really understand this interface and so I cannot see how you
> intend to rewrite the IndexFileDeleter as you describe, but I agree > that if this can be done it is a better solution. So I am okay with > waiting for this approach to mature into code. The deletion policy is called on creation of a writer (onInit) and For example, onCommit you would typically see a List of length 2: the Realize that the "commit on close" mode (autoCommit=false) for > (I would prefer the DeletionPolicy to be a pluggable interface and Good point: I agree an interface here is cleaner. I will use an > I found a flaw in my plan. If the read locks are always applied, Ahh good point. This is why I don't want to risk changes to Lucene > I also spent a good bit of time yesterday further researching the Well, dot-locks (= "create file exclusively") have problems too. http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html#createNewFile( This warning is why we created the NativeFSLockFactory for Directory > So, new proposal: OK. You could implement this in Lucene as a custom deletion policy On Jan 23, 2007, at 2:19 PM, Michael McCandless (JIRA) wrote:
> First do no harm. If that was really your guiding philosophy, you would never change anything. > And Sun's Javadocs on the equivalent Java method, File.createNewFile, has a That page recommends that you use FileLock instead, which maps to Fcntl on This interface follows the completely stupid semantics of System V and Trying to guarantee that kind of discipline from library code severely limits > This warning is why we created the NativeFSLockFactory for Directory locking Take a look at this bug, which explains how that warning got added. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4676183 Read the comment below – the problem with the "protocol" they warn you > OK. You could implement this in Lucene as a custom deletion policy once we This was the response I got on the KinoSearch list: We do not enable NFS writes, only reads (which is why Slashdot is able to Lack of bulletproof support for NFS ain't gonna hold up my next release any > The deletion policy is called on creation of a writer (onInit) and
> once per commit (onCommit) and is given a List of existing commits (= > SegmentInfos instances) in the index. The policy then decides which > commits should be removed and IndexFileDeleter translates that request > (using reference counting, because a given index file may still be > reference by commits that are not yet deleted) into which specific > files to remove. > ... Michael thanks for explaining this further - yes, now it makes sense to me. OK, I've attached a patch to implement "commit on close" and "custom
deletion policies". The design is exactly what's described above. There are no changes to the file format. All tests pass and I've added additional tests for this new Summary of the external changes:
Summary of internal changes:
This is a nice simplification of IndexFileDeleter: previously it
This is also a nice simplification for the same reason as above:
Michael McCandless made changes - 02/Mar/07 03:52 PM
Rebased the patch to the current trunk. I plan to commit this probably end of this week.
Michael McCandless made changes - 07/Mar/07 09:56 AM
Mike, patching take2 on current trunk fails for IndexFileDeleter.java.
patching file src/java/org/apache/lucene/index/IndexFileDeleter.java Hunk #1 FAILED at 18. Also some noise in SegmentInfo.java patching file src/java/org/apache/lucene/index/SegmentInfo.java Hunk #7 succeeded at 291 (offset 3 lines). Woops, looks like the commit for
It's too bad we don't have a patch that's better integrated with svn such that if even you have a more recent svn revision checked out, applying the patch would do so back against the revision it was based on, and then svn would merge the changes committed to the trunk since then. In this case an svn update on the checkout with the diffs produced no conflicts, so if we had such a combined patch tool, it would have worked find here. I suppose the person applying the patch could first "svn update" to its base revision, apply the patch, then svn up, but that's kind of a hassle
Michael McCandless made changes - 09/Mar/07 09:44 AM
Michael McCandless made changes - 13/Mar/07 09:11 AM
I was too slow in reviewing this, so while I was studying the new code it was committed...
Anyhow I have a few comments and a question - I think JIRA The attached 710.comments.diff implements a few suggested changes. I like the definition and use of IndexDeletePolicy and CommitPoint - this is very flexible and clear, and would indeed allow to implement NFS suited logic. These two concepts are central to implementing such logic, and I thought their Javadocs should be enhanced (included in the attached). IndexFileDeleter - it is nice that this became non public and somewhat simpler. I added some internal documentation (not javadocs) in that file as I learned how it works. I think these would be useful for others diving into this code. I also modified some variable names for clarity (in the attached). I don't understand yet why we allow a deletion policy to delete all commits (including the most recent) - TestDeletionPolicy explains this as: "This is useful for adding to a big index w/ autoCommit =false when you know readers are not using it." - so, would I risk losing the big index should uncommited additions fail? what does one earn by this? I first thought we should prevent (exception) deleting the most recent commit, but I must be missing something - could you elaborate on this? checkpoints() is another - more internal - new concept in this code. At writing this I don't fully understand it. IndexWriter has its own checkpoint() method, but it also calls IndexFileDeleter.checkpoint(). IndexReader only calls IndexFileDeletion.checkpoint() - it does not have a checkpoint() itself. ...mmm... For IndexReader it makes sense since it always commits only at close(), or at explicit calls to commit(). Perhaps I understand it better now... Ok, I added some documentation for this in IndexWriter, I think it would also help others. (in the attached.) This issue also introduced constants for file names - hasSingleNorms (i.e. nrm) and SINGLE_NORMS_EXTENSION (.fN) were confusing/collating - so I modified .fN to PLAIN_NORMS_EXTENSION. This issue moved some files logic SegmentInfo. The -1/1/0 logic and especially with norms is confusing, and at least I have to re-read the code carefully each time again and again to be convinced that it is correct. It would be nice when we can get rid of some of the backward compatibility cases here. Anyhow I added some documentation and also replaced the -1/1/0 with constants, I think this makes it easier to understand. Regards,
Doron Cohen made changes - 15/Mar/07 10:08 AM
Thanks for the review Doron!
Your added comments & improvements to the variable names are > IndexFileDeleter - it is nice that this became non public and somewhat I especially like that this class now has very little knowledge of > I don't understand yet why we allow a deletion policy to delete The use case I was thinking of is: say you already have a large index > This issue moved some files logic SegmentInfo. The -1/1/0 logic and Yes the backwards compatibility code (for pre-2.1 indices) is complex.
Michael McCandless made changes - 15/Mar/07 12:14 PM
> If for a given application the developer is concerned
> about safety (losing index due to crash), then the > normal default policy should be used. Spooky... what if onCommit() also deletes all commits? I added warnings about this in IndexDeletionPolicy methods. Just commiited review.take2 + these comments.
> I added warnings about this in IndexDeletionPolicy methods.
I think that's good. Thanks!
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
on how to approach this:
http://www.gossamer-threads.com/lists/lucene/java-dev/44162
http://www.gossamer-threads.com/lists/lucene/java-dev/44236
I think we've iterated to a good approach now. Here's the summary:
only on close" vs writing a segments_N every time there is a
flush, merge, etc., during a single IndexWriter session.
This means a reader won't see anything a writer has been doing
until it's closed.
We would still have an "autoCommit" true/false (default true) to
keep backwards compatibility. If true, the IndexWriter writes a
new segments_N every time it flushes, merges segments, etc.; else
it only writes one on close.
We would add an "abort()" to IndexWriter to not commit, clean up
any temp files created, and rollback.
"Commit on close" will also address / enable fixes for other
issues like prevent readers from refreshing half way through
something like "bulk delete then bulk add", preventing readers
from refreshing during optimize() thus tying up lots of disk
space, enabling a write session to be transactional (all or
none), etc.
which commits are still live and which one is pending (as the
SegmentInfos in IndexWriter, not yet written to disk).
Allow IndexFileDeleter to be subclassed to implement different
"deletion policies".
The base IndexFileDeleter class will use ref counts to figure out
which individual index files are still referenced by one or more
"segments_N" commits or by the uncommitted "in-memory"
SegmentInfos. Then the policy is invoked on commit (and also on
init) and can choose which commits (if any) to now remove.
Add constructors to IndexWriter allowing you to pass in your own
deleter. The default policy would still be "delete all past
commits as soon as a new commit is written" (this is how deleting
happens today).
For NFS we can then try different policies as discussed on those
threads above (there were at least 4 proposals). They all have
different tradeoffs. I would open separate issues for these
policies after this issue is resolved.