Lucene - Core
  LUCENE-710

Implement "point in time" searching without relying on filesystem semantics

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: None
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This was touched on in recent discussion on dev list:

      http://www.gossamer-threads.com/lists/lucene/java-dev/41700#41700

      and then more recently on the user list:

      http://www.gossamer-threads.com/lists/lucene/java-user/42088

      Lucene's "point in time" searching currently relies on how the
      underlying storage handles deletion files that are held open for
      reading.

      This is highly variable across filesystems. For example, UNIX-like
      filesystems usually do "delete on last close", and Windows filesystems
      typically refuse to delete a file open for reading (so Lucene retries
      later). But NFS just removes the file out from under the reader, and
      for that reason "point in time" searching doesn't work on NFS
      (see LUCENE-673).

      With the lockless commits changes (LUCENE-701), it's quite simple to
      re-implement "point in time searching" so as to not rely on filesystem
      semantics: we can just keep more than the last segments_N file (as
      well as all files they reference).

      This is also in keeping with the design goal of "rely on as little as
      possible from the filesystem". EG with lockless we no longer re-use
      filenames (don't rely on filesystem cache being coherent) and we no
      longer use file renaming (because on Windows it can fail). This
      would be another step of not relying on the semantics of "deleting
      open files". The less we require from the filesystem, the more
      portable Lucene will be!

      Where it gets interesting is what "policy" we would then use for
      removing segments_N files. The policy now is "remove all but the last
      one". I think we would keep this policy as the default. Then you
      could imagine other policies:

      • Keep the past N days' worth
      • Keep the last N
      • Keep only those in active use by a reader somewhere (note: tricky
        how to reliably figure this out when readers have crashed, etc.)
      • Keep those "marked" as rollback points by some transaction, or
        marked explicitly as a "snaphshot".
      • Or, roll your own: the "policy" would be an interface or abstract
        class and you could make your own implementation.

      I think for this issue we could just create the framework
      (interface/abstract class for "policy" and invoke it from
      IndexFileDeleter) and then implement the current policy (delete all
      but most recent segments_N) as the default policy.

      In separate issue(s) we could then create the above more interesting
      policies.
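
      As a very rough illustration (the names below are hypothetical and
      exist nowhere in Lucene; the real shape would be worked out in the
      comments and patches on this issue), the framework might look
      something like this:

        import java.util.List;

        // Placeholder for "one segments_N commit point"; illustrative only.
        interface CommitPoint {
          void delete();   // mark this commit (and files only it references) deletable
        }

        // The pluggable "policy": IndexFileDeleter would call it on startup
        // and after every new commit is written.
        abstract class DeletionPolicy {
          abstract void onInit(List commits);    // commits: existing commit points, oldest first
          abstract void onCommit(List commits);  // called after a new segments_N is written
        }

        // Today's behavior expressed as the default policy:
        // remove every commit except the most recent one.
        class DeleteAllButLastCommitPolicy extends DeletionPolicy {
          void onInit(List commits) { onCommit(commits); }
          void onCommit(List commits) {
            for (int i = 0; i < commits.size() - 1; i++) {
              ((CommitPoint) commits.get(i)).delete();
            }
          }
        }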

      I think there are some important advantages to doing this:

      • "Point in time" searching would work on NFS (it doesn't now
        because NFS doesn't do "delete on last close"; see LUCENE-673 )
        and any other Directory implementations that don't work
        currently.
      • Transactional semantics become a possibility: you can set a
        snapshot, do a bunch of stuff to your index, and then rollback to
        the snapshot at a later time.
      • If a reader crashes or the machine gets rebooted, etc., it could choose
        to re-open the snapshot it had previously been using, whereas now
        the reader must always switch to the last commit point.
      • Searchers could search the same snapshot for follow-on actions.
        Meaning, user does search, then next page, drill down (Solr),
        drill up, etc. These are each separate trips to the server and if
        searcher has been re-opened, user can get inconsistent results (=
        lost trust). But with snapshots, one series of search interactions
        could explicitly stay on the snapshot it had started with.

      Attachments

      1. LUCENE-710.take3.patch (163 kB) - Michael McCandless
      2. LUCENE-710.take2.patch (162 kB) - Michael McCandless
      3. LUCENE-710.patch (162 kB) - Michael McCandless
      4. 710.review.take2.diff (26 kB) - Michael McCandless
      5. 710.review.diff (25 kB) - Doron Cohen

        Activity

        Michael McCandless created issue -
        Michael McCandless added a comment -

        There have been some great design discussions / iterations recently
        on how to approach this:

        http://www.gossamer-threads.com/lists/lucene/java-dev/44162

        http://www.gossamer-threads.com/lists/lucene/java-dev/44236

        I think we've iterated to a good approach now. Here's the summary:

        • First, add an option to IndexWriter to "commit (write segments_N)
          only on close" vs writing a segments_N every time there is a
          flush, merge, etc., during a single IndexWriter session.

        This means a reader won't see anything a writer has been doing
        until it's closed.

        We would still have an "autoCommit" true/false (default true) to
        keep backwards compatibility. If true, the IndexWriter writes a
        new segments_N every time it flushes, merges segments, etc.; else
        it only writes one on close.

        We would add an "abort()" to IndexWriter to not commit, clean up
        any temp files created, and rollback.

        "Commit on close" will also address / enable fixes for other
        issues, like preventing readers from refreshing halfway through
        something like "bulk delete then bulk add", preventing readers
        from refreshing during optimize() thus tying up lots of disk
        space, enabling a write session to be transactional (all or
        none), etc.
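
        As a hypothetical usage sketch (the autoCommit constructor argument
        and abort() are the proposed additions; everything else is
        placeholder application code):

          // Open a writer that writes no segments_N until close().
          IndexWriter writer = new IndexWriter(dir, /* autoCommit= */ false, analyzer);
          try {
            for (int i = 0; i < docs.length; i++) {
              writer.addDocument(docs[i]);   // invisible to readers for now
            }
            writer.close();                  // the single commit for this session
          } catch (IOException e) {
            writer.abort();                  // discard everything from this session
            throw e;
          }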

        • Second, change how IndexFileDeleter works: have it keep track of
          which commits are still live and which one is pending (as the
          SegmentInfos in IndexWriter, not yet written to disk).

        Allow IndexFileDeleter to be subclassed to implement different
        "deletion policies".

        The base IndexFileDeleter class will use ref counts to figure out
        which individual index files are still referenced by one or more
        "segments_N" commits or by the uncommitted "in-memory"
        SegmentInfos. Then the policy is invoked on commit (and also on
        init) and can choose which commits (if any) to now remove.

        Add constructors to IndexWriter allowing you to pass in your own
        deleter. The default policy would still be "delete all past
        commits as soon as a new commit is written" (this is how deleting
        happens today).
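
        For example (purely illustrative; the exact constructor signature and
        the KeepLastNCommitsDeleter policy are made-up placeholders):

          // Plug in a custom deletion policy instead of the default
          // "delete all past commits as soon as a new commit is written".
          IndexFileDeleter keepLastTen = new KeepLastNCommitsDeleter(10);
          IndexWriter writer = new IndexWriter(dir, analyzer, keepLastTen);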

        For NFS we can then try different policies as discussed on those
        threads above (there were at least 4 proposals). They all have
        different tradeoffs. I would open separate issues for these
        policies after this issue is resolved.

        Michael McCandless made changes -
        Status: Open → In Progress
        Marvin Humphrey added a comment -

        (This is a continuation of the discussion from one of the threads quoted in the previous comment, with some summations to provide context.)

        Referring to a proposal to implement advisory locking where possible, and prevent index creation on volumes/systems where the delete mechanism fails, Michael McCandless wrote:

        But what I don't like about it is it doesn't "gracefully degrade" to
        the common NFS case where locking does not work. And, this is often
        outside our user's control.

        The stratagem of publicly exposing IndexFileDeleter does not "degrade gracefully", either.

        If today's default deletions policy remains the default, then when an index is put on an NFS system, at some point search-time exceptions will occur after an update to an index. The user has to 1) detect this scenario without Lucene's help, probably from logs or user complaints, 2) diagnose it without the aid of a meaningful error message, and 3) either implement their own IndexFileDeleter or choose an appropriate provided subclass, a task which will require that they grok both the innards of Lucene and the subtleties of NFS.

        The first step to graceful degradation is a meaningful error message. That means detecting a problem which normally requires both an indexing process and a search process to trigger, so we have to simulate it artificially with a test.

        1) Open a file for read/write and write a few bytes to it.
        2) Delete the test file (without closing the handle).
        3) Try to read from the handle and if a "stale NFS filehandle"
        exception is caught, throw something more informative.
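
        A minimal sketch of that probe (hypothetical helper; the file name,
        byte counts, and wording of the error are all illustrative):

          import java.io.*;

          class DeleteOnLastCloseProbe {
            // Steps 1-3 above: write, delete while still open, then read back.
            static void check(File indexDir) throws IOException {
              File probe = new File(indexDir, "delete-probe");
              RandomAccessFile raf = new RandomAccessFile(probe, "rw");
              try {
                raf.write(new byte[] { 1, 2, 3, 4 });   // 1) write a few bytes
                probe.delete();                         // 2) delete without closing
                raf.seek(0);
                raf.read();                             // 3) read through the open handle
              } catch (IOException e) {
                throw new IOException("This volume cannot read files that are "
                    + "held open after deletion (e.g. NFS): " + e);
              } finally {
                raf.close();
              }
            }
          }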

        Our first opportunity to perform this test occurs at index-creation time. This is essentially cost free.

        A second opportunity arises whenever deletions are performed. Here, there's a small cost involved, and it may not be worth it, as this would only catch cases where an index was copied onto an NFS volume rather than created there, then subsequently modified.

        There's also another stratagem available to us: we can have IndexReaders establish advisory read locks against their relevant segments_N files on systems which support such locks. This won't help us to "degrade gracefully". However, it will help us degrade less often, as modern versions of NFS support file locking - but still do NOT support the "delete-on-last-close" mechanism Lucene depends on.

        Implementing advisory read locks is orthogonal to the addition of IndexFileDeleter, but diminishes the justification for adding it as a public API.

        But the good news is since we will allow subclassing to make your own
        deletion policy, we can eventually do both of these approaches and our
        users can pick one or do their own.

        The number of users this class will serve is diminishingly small. Other mitigation strategies are available.

        1) If we implement advisory read locks, many people who see this error will no longer see it. For those who do, the best option is to upgrade the OS to a version which supports advisory locks over NFS. Then an index on an NFS volume will behave as any other.
        2) If you don't actually need to put the index on an NFS volume, put it somewhere else.
        3) Catch stale NFS filehandle exceptions in your search application and refresh the reader when they occur (a sketch follows this list).
        4) Maintain two copies of an index and do an rsync/switch.
        5) Hack Lucene.
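
        A sketch of mitigation 3 (illustrative only; in practice the caller
        would also need to manage closing the reopened searcher):

          // Retry the search once on a new searcher if the files backing the
          // old one were deleted out from under it (e.g. stale NFS filehandle).
          Hits searchWithRetry(Directory dir, Query query) throws IOException {
            IndexSearcher searcher = new IndexSearcher(dir);
            try {
              return searcher.search(query);
            } catch (IOException stale) {
              searcher.close();
              searcher = new IndexSearcher(dir);   // re-open on the latest commit
              return searcher.search(query);
            }
          }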

        Flexibility is not free. There have been recent lamentations on java-dev about how difficult it will be to merge the write interfaces of IndexReader and IndexWriter to provide a single, unified class through which all index modifications can be performed. The exposure of the IndexFileDeleter mechanism contributes to this problem – it's one more small step in the wrong direction.

        Providing a subclassing/callback API is often an elegant strategy, and it is surely better in this case than it would be to provide a list of deletion policies for the user to select from. However, whenever possible, no API is always a better solution – especially in a case like this one, where the functionality provided has nothing to do with Lucene's core mission and is there solely to work around an implementation-specific bug.

        Marvin Humphrey added a comment -

        There are two sections in the previous comment that are supposed to be quoted, but which are not because JIRA ate the email-style "greater than" quoting. They are:

        "But what I don't like about it is it doesn't "gracefully degrade" to
        the common NFS case where locking does not work. And, this is often
        outside our user's control."

        "But the good news is since we will allow subclassing to make your own
        deletion policy, we can eventually do both of these approaches and our
        users can pick one or do their own."

        I wish it were possible to edit that note, as it's really confusing as things stand. A preview mechanism would have been handy.

        Doron Cohen added a comment -

        > * Second, change how IndexFileDeleter works: have it keep track of
        > which commits are still live and which one is pending (as the
        > SegmentInfos in IndexWriter, not yet written to disk).
        >
        > Allow IndexFileDeleter to be subclassed to implement different
        > "deletion policies".
        >
        > The base IndexFileDeleter class will use ref counts to figure out
        > which individual index files are still referenced by one or more
        > "segments_N" commits or by the uncommitted "in-memory"
        > SegmentInfos. Then the policy is invoked on commit (and also on
        > init) and can choose which commits (if any) to now remove.
        >
        > Add constructors to IndexWriter allowing you to pass in your own
        > deleter. The default policy would still be "delete all past
        > commits as soon as a new commit is written" (this is how deleting
        > happens today).
        >
        > For NFS we can then try different policies as discussed on those
        > threads above (there were at least 4 proposals). They all have
        > different tradeoffs. I would open separate issues for these
        > policies after this issue is resolved.
        >

        This ties solving the NFS issue to an extendable file-deletion policy.
        I am wondering whether this is the right way, or whether, perhaps, the
        reference counting should be considered on its own, apart from the
        deletion policy. (Would basing IndexFileDeleter on ref-counts make it
        simpler or harder to maintain?)

        Also, IndexFileDeleter is doing delicate work - not sure you want
        applications to mess with it. Better let applications control some
        simple well defined behavior, maybe the same way that a sorter
        allows applications to provide a comparator, but keeps the sorting
        algorithm for itself.

        Back to reference counting - how about the following approach
        (a rough sketch follows this list):

        • Add to Directory a FileReferenceCounter data member, get()/set() etc.
        • Add a class FileReferenceCounter with simple general methods:
          void increment (String name)
          void decrement (String name)
          int getRefCount (String name)
        • Default implementation would do nothing, i.e. would not record
          references, and always return 0.
        • IndexReader, upon opening a segment, would call increment(segName)
        • IndexReader, upon closing a segment, would call decrement(segName)
        • IndexFileDeleter, before removing a file belonging to a certain segment,
          would verify getRefCount(segName)==0.
        • Notice that the FileReferenceCounter is available from the Directory,
          so no constructors need to be added to IndexWriter/Reader.
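
        A bare-bones sketch of the above (hypothetical classes; the no-op
        default plus a simple in-process counting variant):

          import java.util.HashMap;
          import java.util.Map;

          // Default implementation: records nothing and always reports zero.
          class FileReferenceCounter {
            public void increment(String name) {}
            public void decrement(String name) {}
            public int getRefCount(String name) { return 0; }
          }

          // In-process variant that actually counts references per file name.
          class SimpleFileReferenceCounter extends FileReferenceCounter {
            private final Map counts = new HashMap();
            public synchronized void increment(String name) {
              Integer c = (Integer) counts.get(name);
              counts.put(name, new Integer(c == null ? 1 : c.intValue() + 1));
            }
            public synchronized void decrement(String name) {
              Integer c = (Integer) counts.get(name);
              if (c == null) return;
              if (c.intValue() <= 1) counts.remove(name);
              else counts.put(name, new Integer(c.intValue() - 1));
            }
            public synchronized int getRefCount(String name) {
              Integer c = (Integer) counts.get(name);
              return c == null ? 0 : c.intValue();
            }
          }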

        So, this is adding to Directory a general file utility, no knowledge of
        index structure required in Directory. Also, IndexFileDeleter can remain
        as today, and at some later point can be made more powerful with various
        deletion policies - but those policies remain unrelated to the NFS
        issue - they can focus on point-in-time issues, where I think it
        stemmed from.

        An NFS geared FileReferenceCounter would then be able to keep alive
        "counter files", name those files based on counted fileName plus
        processID plus machID, base getRefCount on safety window since file
        was last touched, etc. All this is left out from point-in-time
        policies (how many/time points-in-time should be retained).

        Michael McCandless added a comment -

        OK, a few top level summary comments and then some specifics below:

        • I don't want to modify the Lucene core (adding advisory read
          locks, etc) just to handle the NFS case. That risks hurting the
          non-NFS cases. "First do no harm." "Keep it simple." We've been
          working hard to remove locking lately (lockless commits) and I'd
          rather not add more locking back.

        By implementing each approach (I think there are 5 different
        ideas now) instead as its own deletion policy (subclass of
        IndexFileDeleter) we contain the added complexity of locking,
        file-based ref counts, etc, to just that one subclass of
        IndexFileDeleter.

        • I think NFS support is part of Lucene's core mission.

        Lucene should try hard to be as portable as possible.

        NFS is used a lot.

        It's tempting to tell users to "upgrade OS", "upgrade NFS server
        and/or client", etc, but taking that approach will only hurt our
        users because typically this is not something they can control.

        Now, if we had to bend over backwards for NFS, then I would agree
        it's not worth it. But, we don't have to: by allowing custom
        deletion policies (which is a minor change) we can try out all of
        the approaches suggested so far on this thread.

        Rather than baking any one of these approaches into the Lucene
        core, I'd rather just enable "custom deletion policies"; then
        people can build out these policies outside of the core (e.g. in
        "contrib" first).

        • I agree that "giving a good error message when index is on NFS" is
          really important and that custom deletion policy alone doesn't
          address this.

        Marvin, I don't think your test will work either (see below).

        But I really like this direction: does anyone know how (Java
        friendly way) to determine that a given directory is on an NFS
        mount? That would be wonderful. I will spin off a new thread
        here.

        Some specifics below:

        Marvin Humphrey wrote:

        > The first step to graceful degradation is a meaningful error message.
        That means detecting a problem which normally requires both a indexing
        process and a search process to trigger, so we have to simulate it
        artificially with a test.
        >
        > 1) Open a file for read/write and write a few bytes to it.
        > 2) Delete the test file (without closing the handle).
        > 3) Try to read from the handle and if a "stale NFS filehandle"
        > exception is caught, throw something more informative.
        >
        > Our first opportunity to perform this test occurs at index-creation
        time. This is essentially cost free.
        >
        > A second opportunity arises whenever deletions are performed. Here,
        there's a small cost involved, and it may not be worth it, as this would
        only catch cases where an index was copied onto an NFS volume rather
        than created there, then subsequently modified.

        I think this test won't work (though I haven't tested...). Typically
        an NFS client will catch this case and locally emulate "delete on last
        close". Worse, even if it doesn't, those bytes would likely be cached
        and then would fail to hit "stale NFS filehandle".

        > But the good news is since we will allow subclassing to make your own
        > deletion policy, we can eventually do both of these approaches and our
        > users can pick one or do their own.
        >
        > The number of users this class will serve is diminishingly small.
        Other mitigation strategies are available.
        >
        > 1) If we implement advisory read locks, many people who see this
        error will no longer see it. For those who do, the best option is to
        upgrade the OS to a version which supports advisory locks over NFS.
        Then an index on an NFS volume will behave as any other.
        > 2) If you don't actually need to put the index on an NFS volume, put
        it somewhere else.
        > 3) Catch stale NFS filehandle exceptions in your search application
        and refresh the reader when they occur.
        > 4) Maintain two copies of an index and do an rsync/switch.
        > 5) Hack Lucene.

        5) isn't really a good option since we all can't even agree how to
        "hack Lucene" to make this work! 1) I think is too dangerous as part
        of the core. 2) typically this is not an option. People choose NFS
        because they want to share the index. 4) is a fair amount of added
        complexity. 3) is the most viable option I see here, but it's not
        great because you're forced to refresh "right now". What if warming
        takes 8 minutes? What if "now" is a bad time because deletes were
        done by the writer but not yet adds?

        > Flexibility is not free. There have been recent lamentations on
        java-dev about how difficult it will be to merge the write interfaces of
        IndexReader and IndexWriter to provide a single, unified class through
        which all index modifications can be performed. The exposure of the
        IndexFileDeleter mechanism contributes to this problem – it's one more
        small step in the wrong direction.

        Yes there is an open question now on what to do about the confusion on
        using IndexReader vs IndexWriter. I think moving towards "use
        IndexWriter for changes, use IndexReader for reading" is the best
        solution here. But I don't see how this relates to allowing
        subclassing of IndexFileDeleter to make your own deletion policy.

        > Providing a subclassing/callback API is often an elegant strategy,
        and it is surely better in this case than it would be to provide a list
        of deletion policies for the user to select from. However, whenever
        possible, no API is always a better solution – especially in a case
        like this one, where the functionality provided has nothing to do with
        Lucene's core mission and is there solely to work around an
        implmentation-specific bug.

        I disagree on this point ("no" API is better than subclassing). As
        you've said, this issue won't affect that many people (though I think
        it's a fairly large subset of our users). Given that, I would not
        want to add file locking & additional complexity into the Lucene core,
        just to handle NFS.

        By allowing a different delete policy as a subclass of
        IndexFileDeleter we keep the changes required for supporting NFS way
        outside the Lucene core. Since there's so much debate about which
        deletion policy is best we should create all of these in contrib to
        begin with and if something proves reliable we can eventually promote
        it into core Lucene.

        I think subclassing is perfect for this sort of situation. It's like
        the various LockFactory implementations we have: there is no "one size
        fits all".

        Mike

        Michael McCandless added a comment -

        Doron Cohen wrote:

        > This ties solving the NFS issue with an extendable-file-deletion policy.
        > I am wondering is this the right way, or, perhaps, should the reference
        > counting be considered alone, apart from the deletion policy.
        > (Would modifying IndexFileDeleter to base on ref-count make it simpler
        > or harder to maintain?)
        >
        > Also, IndexFileDeleter is doing delicate work - not sure you want
        > applications to mess with it. Better let applications control some
        > simple well defined behavior, maybe the same way that a sorter
        > allows applications to provide a comparator, but keeps the sorting
        > algorithm for itself.

        The solution I have in mind abstracts away all tricky details of
        deleting files. EG something like:

        public class OnlyLastCommitDeleter extends IndexFileDeleter {

          void onInit(List commits) {
            onCommit(commits);
          }

          void onCommit(List commits) {
            // Keep only the most recent commit; delete all earlier ones.
            if (commits.size() > 1) {
              for (int i = 0; i < commits.size() - 1; i++) {
                deleteCommit(commits.get(i));
              }
            }
          }
        }

        Ie, the sole responsibility of the IndexFileDeleter subclass (policy)
        is to decide when to delete a commit. The rest of the details
        (figuring out what actual files can be deleted now that a given commit
        segments_N is deleted) are handled by the base class (with in-memory
        ref counting).

        > Back to reference counting,- how about the following approach:
        > - Add to Directory a FileReferenceCounter data member, get()/set() etc.
        > - Add a class FileReferenceCounter with simple general methods:
        > void increment (String name)
        > void decrement (String name)
        > int getRefCount (String name)
        > - Default implementation would do nothing, i.e. would not record
        > references, and always return 0.
        > - IndexReader, upon opening a segment, would call increment(segName)
        > - IndexReader, upon closing a segment, would call decrement(segName)
        > - IndexFileDeleter, before removing a file belonging to a certain segment,
        > would verify getRefCount(segName)==0.
        > - Notice that the FilereferenceCounter is available from the Directory,
        > so no constructors should be added to IndexWriter/Reader.
        >
        > So, this is adding to Directory a general file utility, no knowledge of
        > index structure required in Directory. Also, IndexFileDeleter can remain
        > as today, and at some later point can be made more powerful with various
        > deletion policies - but those policies remain unrelated to the NFS
        > issue - they can focus on point-in-time issues, where I think it
        > stemmed from.
        >
        > An NFS geared FileReferenceCounter would then be able to keep alive
        > "counter files", name those files based on counted fileName plus
        > processID plus machID, base getRefCount on safety window since file
        > was last touched, etc. All this is left out from point-in-time
        > policies (how many/time points-in-time should be retained).

        I think this approach could work, but, rather than implementing in the
        Lucene core (adding methods to Directory) I'd like to see it tested as
        a custom deletion policy + wrappers around IndexReader
        creation/destruction.

        We have so much debate about the best "deletion policy" for NFS that
        I'd like to make the minimal extension to the core (ability to make
        your own "deletion policy") and then people can build their own and
        try them out.

        Mike

        Michael McCandless added a comment -

        One clarification on "different deletion policies": to support "commit
        on close" in IndexWriter, I already have to improve IndexFileDeleter
        to give it a different deletion policy than the current one.

        Specifically, the deleter must not delete anything referenced by the
        last commit nor anything referenced by the in-memory SegmentInfos.

        For example, if a writer is opened with autoCommit=false ("commit on
        close") on an existing index, and lots of docs are added/deleted/etc,
        there will have been flushes & merges of segments. The deletion
        policy should not delete anything that existed "at the start" because
        it's referenced by the segments_N commit, nor anything that is now
        referenced by the in-memory SegmentInfos. But it should delete
        anything "in between" (any newly written segments that have now been
        merged away).

        To the deleter this would just be a different policy, one that keeps
        two SegmentInfos alive (one on disk and one not yet committed, in
        memory). And the default deletion policy would be selected depending
        on whether the writer is in autoCommit mode or not.

        Marvin Humphrey added a comment -

        On Jan 19, 2007, at 2:04 PM, Michael McCandless (JIRA) wrote:
        > * I think NFS support is part of Lucene's core mission.

        When I asserted that IndexFileDeleter had nothing to do with Lucene's core
        mission, I meant: you don't use Lucene to build yourself an app which helps
        you delete files.

        > Yes there is an open question now on what to do about the confusion on using
        > IndexReader vs IndexWriter. I think moving towards "use IndexWriter for
        > changes, use IndexReader for reading" is the best solution here. But I
        > don't see how this relates to allowing subclassing of IndexFileDeleter to
        > make your own deletion policy.

        They're hard to refactor because they're both big, hence adding either code or
        API commitments to them should be avoided when possible. We're in agreement
        about the desirability of simplicity. We part ways in how we measure
        simplicity: I give greater emphasis to simplicity of API design.

        > I disagree on this point ("no" API is better than subclassing).

        We're talking past each other. I was generalizing: a 100% successful, purely
        internal "black box" solution is always better than a solution that involves
        the user.

        > I would not want to add file locking & additional complexity into the Lucene
        > core, just to handle NFS.

        This is where our differing priorities manifest. I would rather add some
        code complexity to the Lucene "core" than accept the increased
        support burden of an expanded API.

        Ironically, though you consider supporting NFS "part of Lucene's core
        mission", for the average user your proposal as it stands is not very
        user-friendly. People like Chuck, Doron, and Robert will have no trouble with
        it, but if you're a newcomer to Lucene and you "just want to put an index on
        NFS", subclassing IndexFileDeleter will pose a challenge.

        I also think you may be over-estimating the amount of effort it will take to
        exploit advisory read locks. (The vexing problem of how to issue a warning
        when index creation is attempted on an NFS volume is orthogonal to the
        read-locks approach as well.) They should be easy in KS; I'll know soon enough.
        However, there are some OO discipline issues which complicate applying what I
        have in mind to Java Lucene. In KS, the public API is defined solely via
        documentation, so I can have code in Index call methods from Store without
        having to expose it. With Lucene divided into multiple packages, that's a
        problem.

        > I think subclassing is perfect for this sort of situation.

        I'm not so enamored of subclassing. It's less constraining than some other
        approaches, but it's still constraining.

        Case in point: it's not possible to provide a subclass of IndexFileDeleter
        which exploits advisory read locking under your current proposal.

        In theory, your proposal even prevents the addition of such read locks to
        Lucene later, because doing so could conflict with a deletions policy you've
        allowed the user to set. (; Given that locks over NFS make you "nervous",
        perhaps you consider foreclosing that option a feature.

        > It's like the various LockFactory implementations we have: there is no "one
        > size fits all".

        I don't think LockFactory ought to be exposed either.

        Reading from an index – any index, on any volume – should Just Work.
        Writing to an index from a single IndexWriter should Just Work. In a perfect
        world, writing from multiple writers simultaneously would Just Work, too, but
        as that's inherently impractical given Lucene's current design, opening a
        second writer must fail. That failure should be the only visible evidence
        that a locking mechanism even exists.

        In my view, any deviance from that ideal API due to implementation defects
        should be handled with exceptions rather than API additions.

        In keeping with this philosophy, Lock is not publicly exposed in KinoSearch.
        In fact, nothing about the locking mechanism is publicly exposed. So far,
        there seem to be three bugs with the current implementation:

        • Stale NFS Filehandle exceptions.
        • Stale lock files interfering with unattended indexing sessions. I plan
          to mitigate this by moving to advisory write locks when possible.
        • Multiple machines can cause index corruption when attempting to write
          simultaneously to a shared volume. Moving the write.lock file to the
          index directory, as enabled by lockless commits, solves this problem.

        Once advisory file locks are in place, and if they work as expected under
        recent versions of NFS, I expect no further problems under any common, recent
        Unix.

        With the switch to lockless commits in KinoSearch, I've been able to refactor
        Lock and eliminate all of Lock's subclasses, simplifying the KinoSearch "core".
        "No more subclassing of Lock" was originally a line-item in my list of
        "great stuff" about lockless commits, but I had to take it out because it
        wasn't true for Lucene!

        With Otis signing on to your solution, it looks like momentum is gathering for
        it. For the record, I don't think it's a catastrophic change, just suboptimal
        and IMO not ready for addition until improved.

        I think you can do better.

        Michael McCandless added a comment -

        Quick summary first:

        OK, as you said (and I agree) I think we just have a difference of
        opinion on what's the "lesser evil" tradeoff here. You would prefer to
        change the core of KinoSearch to always use advisory read locks for
        all indices.

        Whereas I prefer to leave the Lucene core untouched since things work
        fine in most cases today ("if it ain't broke don't fix it"), and then
        open up an API so for those cases (NFS) where it doesn't work, users
        at least have possible solutions to try.

        I think you also have a high confidence that the locking approach will
        work fine (be perfect) on the first go and will not alienate too many
        users, but I don't: I have had problems with locking in the past and I
        think most users don't have the freedom to "upgrade OS/fileserver".

        So I would prefer instead to open a minimal API in the core of Lucene
        (so users can use different deletion policies), and then try the 5
        different ideas proposed so far (and more later I'm sure) as their own
        deletion policy, external to Lucene's core (eg in contrib). If one of
        them proves to work well, universally, then sometime down the road we
        can promote it as the default deletion policy.

        OK details below:

        > > * I think NFS support is part of Lucene's core mission.
        >
        > When I asserted that IndexFileDeleter had nothing to do with Lucene's core
        > mission, I meant: you don't use Lucene to build yourself an app which helps
        > you delete files.

        Well, "custom deletion policies" is in support of the core mission of
        "working over NFS".

        > > Yes there is an open question now on what to do about the confusion on using
        > > IndexReader vs IndexWriter. I think moving towards "use IndexWriter for
        > > changes, use IndexReader for reading" is the best solution here. But I
        > > don't see how this relates to allowing subclassing of IndexFileDeleter to
        > > make your own deletion policy.
        >
        > They're hard to refactor because they're both big, hence adding either code or
        > API commitments to them should be avoided when possible. We're in agreement
        > about the desirability of simplicity. We part ways in how we measure
        > simplicity: I give greater emphasis to simplicity of API design.
        >
        > > I disagree on this point ("no" API is better than subclassing).
        >
        > We're talking past each other. I was generalizing: a 100% successful, purely
        > internal "black box" solution is always better than a solution that involves
        > the user.

        OK, yes in the ideal case, no API is better than API if your situation
        allows for no API. I just don't think this is one of those
        situations: I don't think we have a clear cut "one size fits all"
        solution.

        > > I would not want to add file locking & additional complexity into the Lucene
        > > core, just to handle NFS.
        >
        > This is where our differing priorities manifest. I would rather add some
        > code complexity to the Lucene "core" than accept the increased
        > support burden of an expanded API.
        >
        > Ironically, though you consider supporting NFS "part of Lucene's core
        > mission", for the average user your proposal as it stands is not very
        > user-friendly. People like Chuck, Doron, and Robert will have no trouble with
        > it, but if you're a newcomer to Lucene and you "just want to put an index on
        > NFS", subclassing IndexFileDeleter will pose a challenge.

        Yes, but at least having the option (picking one of the deletion
        policies in "contrib" once we've built them out) is quite a bit better
        than what we have today (no option besides "you must refresh now"). I
        would love to have the "perfect" solution (which you are aiming for in
        one step), but I'll settle today for just good progress: "progress not
        perfection".

        > I also think you may be over-estimating the amount of effort it will take to
        > exploit advisory read locks. (The vexing problem of how to issue a warning
        > when index creation is attempted on an NFS volume is orthogonal to the
        > read-locks approach as well.) They should be easy in KS; I'll know soon enough.
        > However, there are some OO discipline issues which complicate applying what I
        > have in mind to Java Lucene. In KS, the public API is defined solely via
        > documentation, so I can have code in Index call methods from Store without
        > having to expose it. With Lucene divided into multiple packages, that's a
        > problem.

        Yes detection of NFS is orthogonal and I would love to find a solution
        here. And yes Java's method/field protection is quite different from
        what KS can do.

        > > I think subclassing is perfect for this sort of situation.
        >
        > I'm not so enamored of subclassing. It's less constraining than some other
        > approaches, but it's still constraining.
        >
        > Case in point: it's not possible to provide a subclass of IndexFileDeleter
        > which exploits advisory read locking under your current proposal.
        >
        > In theory, your proposal even prevents the addition of such read locks to
        > Lucene later, because doing so could conflict with a deletions policy you've
        > allowed the user to set. (; Given that locks over NFS make you "nervous",
        > perhaps you consider foreclosing that option a feature.

        No, I would not consider foreclosing that option a feature!

        Yes, I am nervous about relying on advisory read locks 100% today in
        the Lucene core. But, I would love to be proven wrong in the future:
        if your lock based solution actually works out "perfectly" in the
        future then users can indeed fire up Lucene/KS regardless of what
        filesystem the index is on. I would equally love to see file-based
        reference counting work out, etc: if any option proves reliable enough
        in the future then we can make it the default deletion policy.

        Yes users who set their own deletion policies would not get this
        default but that's OK: such users understand what they've done.

        And, I don't want to change the default policy now ("first do no
        harm").

        > > It's like the various LockFactory implementations we have: there is no "one
        > > size fits all".
        >
        > I don't think LockFactory ought to be exposed either.
        >
        > Reading from an index – any index, on any volume – should Just Work.
        > Writing to an index from a single IndexWriter should Just Work. In a perfect
        > world, writing from multiple writers simultaneously would Just Work, too, but
        > as that's inherently impractical given Lucene's current design, opening a
        > second writer must fail. That failure should be the only visible evidence
        > that a locking mechanism even exists.
        >
        > In my view, any deviance from that ideal API due to implementation defects
        > should be handled with exceptions rather than API additions.
        >
        > In keeping with this philosophy, Lock is not publicly exposed in KinoSearch.
        > In fact, nothing about the locking mechanism is publicly exposed. So far,
        > there seem to be three bugs with the current implementation:
        >
        > * Stale NFS Filehandle exceptions.
        > * Stale lock files interfering with unattended indexing sessions. I plan
        > to mitigate this by moving to advisory write locks when possible.
        > * Multiple machines can cause index corruption when attempting to write
        > simultaneously to a shared volume. Moving the write.lock file to the
        > index directory, as enabled by lockless commits, solves this problem.
        >
        > Once advisory file locks are in place, and if they work as expected under
        > recent versions of NFS, I expect no further problems under any common, recent
        > Unix.
        >
        > With the switch to lockless commits in KinoSearch, I've been able to refactor Lock
        > and eliminate all of Lock's subclasses, simplifying the KinoSearch "core".
        > "No more subclassing of Lock" was originally a line-item in my list of
        > "great stuff" about lockless commits, but I had to take it out because it
        > wasn't true for Lucene!
        >
        > With Otis signing on to your solution, it looks like momentum is gathering for
        > it. For the record, I don't think it's a catastrophic change, just suboptimal
        > and IMO not ready for addition until improved.

        I agree it would be great to reach this perfect world. It would be
        even better to get there in just one jump from where we are today.
        It's just not nearly clear to me that a locking solution (or reference
        counting, time based expiration, etc.) for NFS is or will evolve to
        that perfect solution. And I think "not alienating users" who are
        stuck on past UNIX versions is more important than "not adding any
        API". I think we are just picking a different "lesser evil".

        > I think you can do better.

        With time, I hope so too. Progress not perfection!

        Doron Cohen added a comment -

        Michael McCandless wrote:

        > The solution I have in mind abstracts away all tricky details of
        > deleting files. EG something like:
        >
        > public class OnlyLastCommitDeleter extends IndexFileDeleter {
        >
        > void onInit(List commits) {
        >   onCommit(commits);
        > }
        >
        > void onCommit(List commits) {
        >   if (commits.size() > 1) {
        >     for (int i = 0; i < commits.size() - 1; i++) {
        >       deleteCommit(commits.get(i));
        >     }
        >   }
        > }
        >
        > Ie, the sole responsibility of the IndexFileDeleter subclass (policy)
        > is to decide when to delete a commit. The rest of the details
        > (figuring out what actual files can be deleted now that a given commit
        > segments_N is deleted) are handled by the base class (with in-memory
        > ref counting).
        >

        I don't really understand this interface and so I cannot see how
        you intend to rewrite the IndexFileDeleter as you describe, but I
        agree that if this can be done it is a better solution. So I am
        okay with waiting for this approach to mature into code.

        (I would prefer the DeletionPolicy to be a
        pluggable interface and the IndexFileDeleter to be
        an internal class, so that at least we do not expose now something
        that would stand in our way in the future. But again, since I do not
        fully understand your solution, please bear with me if this is
        not making sense.)

        Marvin Humphrey added a comment -

        I found a flaw in my plan. If the read locks are always applied, they will
        slow deletion of obsolete segments for everybody where delete-on-last-close
        currently works as intended. Right now, the files are unlinked and disappear
        as soon as the last reader holding them open goes away. With read locks, the
        unlink op wouldn't take place if a reader was open.

        I also spent a good bit of time yesterday further researching the subtleties
        of locks over NFS. Summing up: flock can work, but dot-lock files are more
        reliable.

        So, new proposal:

        Add a new public method IndexReader.aquireReadLock(String hostId), which would
        write a dot-lock file to the index directory with hostId, a pid, and an
        incrementing integer spliced into the file name. The relevant segments_N file
        name would be written to the lockfile. Calling it would only be necessary on NFS,
        and an exception would occur if the attempt to create the lockfile
        failed.

        The deletions policy would work as in my earlier proposal, and the user
        wouldn't have to grok its innards. Troubleshooting stale lockfiles
        by snooping the index directory contents would be straightforward and
        intuitive.

        We might want aquireReadLock() to automatically zap any locks associated with
        hostId for which the pid couldn't be found, or we might want to break that out
        into another method.
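
        To make the proposal concrete, here is a minimal Java sketch of the
        dot-lock protocol described above. Everything in it (the method name,
        the lock file naming scheme, how the pid is supplied) is an
        illustrative assumption, not the proposed Lucene API:

            import java.io.File;
            import java.io.FileWriter;
            import java.io.IOException;

            public class ReadLockSketch {

                // Create a per-reader dot-lock file recording which segments_N
                // file this reader is holding open.
                public static File acquireReadLock(File indexDir, String hostId,
                                                   long pid, int counter,
                                                   String segmentsFileName)
                        throws IOException {
                    File lockFile = new File(indexDir,
                            "read-" + hostId + "-" + pid + "-" + counter + ".lock");
                    // createNewFile creates the file only if it does not exist yet.
                    if (!lockFile.createNewFile()) {
                        throw new IOException("could not create read lock: " + lockFile);
                    }
                    FileWriter out = new FileWriter(lockFile);
                    try {
                        out.write(segmentsFileName); // e.g. "segments_5"
                    } finally {
                        out.close();
                    }
                    return lockFile;
                }
            }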

        Michael McCandless added a comment -

        > I don't really understand this interface and so I cannot see how you
        > intend to rewrite the IndexFileDeleter as you describe, but I agree
        > that if this can be done it is a better solution. So I am okay with
        > waiting for this approach to mature into code.

        The deletion policy is called on creation of a writer (onInit) and
        once per commit (onCommit) and is given a List of existing commits (=
        SegmentInfos instances) in the index. The policy then decides which
        commits should be removed and IndexFileDeleter translates that request
        (using reference counting, because a given index file may still be
        referenced by commits that are not yet deleted) into which specific
        files to remove.

        For example, onCommit you would typically see a List of length 2: the
        prior commit and the new one. And the default policy
        (KeepOnlyLastCommit) would at this point remove the prior one.
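
        As an illustration of that shape, here is a minimal sketch of a
        "keep the last N commits" policy. The onInit/onCommit/deleteCommit
        names follow the example quoted earlier in this thread; treat them as
        assumptions that may not match the committed API exactly:

            import java.util.List;

            public class KeepLastNCommitsPolicySketch {

                private final int numToKeep;

                public KeepLastNCommitsPolicySketch(int numToKeep) {
                    this.numToKeep = numToKeep;
                }

                // Called when a writer is first opened: apply the same rule.
                public void onInit(List commits) {
                    onCommit(commits);
                }

                // Called after each commit; commits are ordered oldest first,
                // so drop everything except the newest numToKeep entries.
                public void onCommit(List commits) {
                    for (int i = 0; i < commits.size() - numToKeep; i++) {
                        deleteCommit(commits.get(i));
                    }
                }

                // Hypothetical stand-in for however a commit is actually
                // removed; the real work (mapping a deleted commit to files,
                // with reference counting) stays inside the deleter.
                private void deleteCommit(Object commit) {
                    // no-op placeholder in this sketch
                }
            }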

        Realize that the "commit on close" mode (autoCommit=false) for
        IndexWriter (that I'm doing as part of this issue) actually keeps 2
        SegmentInfos alive at any given time: first is the segments_N file in
        the index, and second is the "in memory" SegmentInfos that haven't yet
        been committed to a segments_N file. It's only on close when the
        commit takes place that the deleter then deletes the previous
        segments_N commit.
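
        From the application side that behavior looks roughly like the sketch
        below. The constructor taking an autoCommit flag is described in the
        patch summary later in this issue; its exact shape here is an
        assumption:

            import org.apache.lucene.analysis.standard.StandardAnalyzer;
            import org.apache.lucene.document.Document;
            import org.apache.lucene.document.Field;
            import org.apache.lucene.index.IndexWriter;
            import org.apache.lucene.store.Directory;
            import org.apache.lucene.store.RAMDirectory;

            public class CommitOnCloseSketch {
                public static void main(String[] args) throws Exception {
                    Directory dir = new RAMDirectory();
                    // autoCommit=false: no new segments_N is written while we work.
                    IndexWriter writer =
                            new IndexWriter(dir, false, new StandardAnalyzer(), true);
                    Document doc = new Document();
                    doc.add(new Field("body", "hello world",
                                      Field.Store.YES, Field.Index.TOKENIZED));
                    writer.addDocument(doc);
                    // A reader opened at this point still sees only the prior
                    // commit; the in-memory SegmentInfos is not yet committed.
                    writer.close(); // the single commit happens here
                }
            }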

        > (I would prefer the DeletionPolicy to be a pluggable interface and
        > the IndexFileDeleter to be an internal class, so that at least we
        > do not expose now something that would stand in our way in the
        > future. But again, since I do not fully understand your solution
        > maybe please bear with me if this is not making sense.)

        Good point: I agree an interface here is cleaner. I will use an
        interface (not subclass) and make IndexFileDeleter entirely internal.
        The deletion policy doesn't need to see any details of the
        IndexFileDeleter class.

        Michael McCandless added a comment -

        > I found a flaw in my plan. If the read locks are always applied,
        > they will slow deletion of obsolete segments for everybody where
        > delete-on-last-close currently works as intended. Right now, the
        > files are unlinked and disappear as soon as the last reader holding
        > them open goes away. With read locks, the unlink op wouldn't take
        > place if a reader was open.

        Ahh good point. This is why I don't want to risk changes to Lucene
        core: most of the time Lucene's "point in time" searching works
        perfectly now. It's just NFS (so far) that's problematic, which is why
        I want to keep the solution "external" to Lucene by allowing custom
        deletion policies. Plus we obviously have a lot of deletion policies
        to try on NFS. First do no harm.

        > I also spent a good bit of time yesterday further researching the
        > subtleties of locks over NFS. Summing up: flock can work, but
        > dot-lock files are more reliable.

        Well, dot-locks (= "create file exclusively") have problems too.
        There have been issues with bugs in at least certain Linux NFS
        clients. And Sun's Javadoc for the equivalent Java method,
        File.createNewFile, has a warning about not relying on this for
        locking:

        http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html#createNewFile()

        This warning is why we created the NativeFSLockFactory for Directory
        locking in the first place.
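
        For reference, this is roughly how an application opts into those
        native locks instead of dot-lock files. The exact factory methods
        below are my recollection of the 2.1-era API and are an assumption to
        double-check, not a quote from the code:

            import java.io.File;
            import org.apache.lucene.store.Directory;
            import org.apache.lucene.store.FSDirectory;
            import org.apache.lucene.store.NativeFSLockFactory;

            public class NativeLockSketch {
                public static void main(String[] args) throws Exception {
                    File indexDir = new File("/path/to/index"); // illustrative path
                    // Ask the directory to use OS-level (java.nio) locks for the
                    // write lock rather than "create file exclusively" dot-locks.
                    Directory dir = FSDirectory.getDirectory(indexDir,
                            new NativeFSLockFactory(indexDir));
                    // ... open IndexWriter / IndexReader against "dir" as usual ...
                }
            }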

        > So, new proposal:
        >
        > Add a new public method IndexReader.aquireReadLock(String hostId),
        > which would write a dot-lock file to the index directory with
        > hostId, a pid, and an incrementing integer spliced into the file
        > name. The relevant segments_N file name would be written to the
        > lockfile. Calling it would only be necessary on NFS, and an
        > exception would occur if the attempt to create the lockfile failed.
        >
        > The deletions policy would work as in my earlier proposal, and the
        > user wouldn't have to grok its innards. Troubleshooting stale
        > lockfiles by snooping the index directory contents would be
        > straightforward and intuitive.
        >
        > We might want aquireReadLock() to automatically zap any locks
        > associated with hostId for which the pid couldn't be found, or we
        > might want to break that out into another method.

        OK. You could implement this in Lucene as a custom deletion policy
        once we get this committed (I think this is 6 proposals now for
        "deletion policy" for NFS), plus a wrapper around IndexReader.

        Marvin Humphrey added a comment -

        On Jan 23, 2007, at 2:19 PM, Michael McCandless (JIRA) wrote:

        > First do no harm.

        If that was really your guiding philosophy, you would never change anything.

        > And Sun's Javadocs on the equivalent Java method, File.createNewFile, has a
        > warning about not relying on this for locking:
        >
        > http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html#createNewFile()

        That page recommends that you use FileLock instead, which maps to Fcntl on
        some systems. The FreeBSD manpage on Fcntl uses less delicate language than
        Sun in pointing out the drawbacks:

        This interface follows the completely stupid semantics of System V and
        IEEE Std 1003.1-1988 (``POSIX.1'') that require that all locks associated
        with a file for a given process are removed when any file descriptor for
        that file is closed by that process. This semantic means that
        applications must be aware of any files that a subroutine library may access.

        Trying to guarantee that kind of discipline from library code severely limits
        your options.
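
        To make the quoted caveat concrete, here is a small Java illustration
        of the hazard. It is purely a demonstration of the POSIX behavior
        being described (whether the lock is really dropped depends on the OS
        and JVM), not code from Lucene or KinoSearch:

            import java.io.File;
            import java.io.FileInputStream;
            import java.io.RandomAccessFile;
            import java.nio.channels.FileChannel;
            import java.nio.channels.FileLock;

            public class PosixLockHazardSketch {
                public static void main(String[] args) throws Exception {
                    File f = new File("example.lock"); // illustrative file name
                    RandomAccessFile raf = new RandomAccessFile(f, "rw");
                    FileChannel channel = raf.getChannel();
                    FileLock lock = channel.lock(); // typically fcntl on Unix

                    // Some library routine merely *reads* the same file:
                    FileInputStream in = new FileInputStream(f);
                    in.close();
                    // Per the semantics quoted above, closing any descriptor
                    // for the file can drop every lock this process holds on
                    // it, so "lock" may no longer protect anything here.

                    lock.release();
                    raf.close();
                }
            }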

        > This warning is why we created the NativeFSLockFactory for Directory locking
        > in the first place.

        Take a look at this bug, which explains how that warning got added.

        http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4676183

        Read the comment below – the problem with the "protocol" they warn you
        against using is with deleteOnExit(), not createNewFile(). I think you're
        better off with dot-locks.

        > OK. You could implement this in Lucene as a custom deletion policy once we
        > get this commmitted (I think this is 6 proposals now for "deletion policy"
        > for NFS), plus a wrapper around IndexReader.

        This was the response I got on the KinoSearch list:

        We do not enable NFS writes, only reads (which is why Slashdot is able to
        reliably use NFS for its heavy load). So I don't think that will work,
        if I understand you correctly.

        Lack of bulletproof support for NFS ain't gonna hold up my next release any
        longer. What a freakin' nightmare...

        Doron Cohen added a comment -

        > The deletion policy is called on creation of a writer (onInit) and
        > once per commit (onCommit) and is given a List of existing commits (=
        > SegmentInfos instances) in the index. The policy then decides which
        > commits should be removed and IndexFileDeleter translates that request
        > (using reference counting, because a given index file may still be
        > reference by commits that are not yet deleted) into which specific
        > files to remove.
        > ...

        Michael, thanks for explaining this further - yes, now it makes sense to me.

        Michael McCandless added a comment -

        OK, I've attached a patch to implement "commit on close" and "custom
        deletion policies". The design is exactly what's described above.

        There are no changes to the file format.

        All tests pass and I've added additional tests for this new
        functionality.

        Summary of the external changes:

        • For "commit on close":
        • Added new IndexWriter constructors that take "autoCommit"
          boolean: if it's false, then readers will not see any actions
          done by this writer (no new segments_N is written) until
          writer.close() is called.
        • Added IndexWriter.abort() which closes the writer without
          committing, cleaning up any temp files it had added to the
          index.
        • For "custom deletion policies":
        • Created IndexDeletionPolicy interface and added constructors to
          IndexReader/IndexWriter allowing you to specify a deletion
          policy (a usage sketch follows after these lists).
        • Created IndexCommitPoint interface: this is passed to the
          deletion policy to represent each commit. The policy calls the
          delete method on this interface to remove a commit.
        • Created one deletion policy (KeepOnlyLastCommitDeletionPolicy)
          and made that the default policy. (The unit test for this has
          other "interesting" policies like "delete by age since this
          commit was obsoleted" initially discussed on java-dev.)

        Summary of internal changes:

        • Created "files()" method in SegmentInfo (and changed
          SegmentReader.files() to use it).
        • Changed IndexFileDeleter to use reference counting to keep track
          of which files are deletable because no commit(s) (nor the
          in-memory SegmentInfos) reference them.

        This is a nice simplification of IndexFileDeleter: previously it
        had detailed knowledge about which files, extensions, etc., to
        look for and delete. Now it has far less of that because it
        relies entirely on SegmentInfo.files() to compute that.

        • Changed IndexReader/IndexWriter to not directly delete files and
          instead notify IndexFileDeleter when there has been a change to
          the in-memory SegmentInfos. The deleter then incref/decref's to
          determine what files can safely be deleted.

        This is also a nice simplification for the same reason as above:
        now the writers just make changes to SegmentInfo(s) without having
        to compute/track the consequences to specific index files.

        • Simplified the fix for LUCENE-702 (addIndexes corrupts index on
          disk full) to just temporarily set autoCommit=false if it's not
          already.
        • Added get/setDefaultInfoStream to IndexWriter so you could see
          things that happen during IndexWriter constructor.
        • No longer store/propagate a persistent IndexFileDeleter inside
          IndexReader (removed protected method get/setDeleter). This is a
          nice simplification because the deleter is now only needed briefly
          during "commit()".
        • Reworked the toplevel javadoc for IndexWriter.
        • Added try/finally to remove a partially written segments_N if we
          hit IOException when trying to write it.
        • Other small changes (minor refactoring, fixes to javadocs, fixed
          spelling, etc.).
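
        To tie the external changes together, here is a usage sketch built
        from the names listed above (IndexDeletionPolicy, the autoCommit
        constructors). The exact signatures are assumptions and may differ
        from the committed code:

            import java.util.List;
            import org.apache.lucene.analysis.standard.StandardAnalyzer;
            import org.apache.lucene.index.IndexDeletionPolicy;
            import org.apache.lucene.index.IndexWriter;
            import org.apache.lucene.store.Directory;
            import org.apache.lucene.store.RAMDirectory;

            // A policy that never deletes anything, so older commit points stay
            // available to "point in time" readers (e.g. on NFS) until some
            // external cleanup removes them.
            class KeepAllCommitsPolicy implements IndexDeletionPolicy {
                public void onInit(List commits) {
                    // keep all commits found when the writer is opened
                }
                public void onCommit(List commits) {
                    // keep every commit made by this writer as well
                }
            }

            public class DeletionPolicyUsageSketch {
                public static void main(String[] args) throws Exception {
                    Directory dir = new RAMDirectory();
                    // autoCommit=false plus a custom deletion policy, per the
                    // summary above.
                    IndexWriter writer = new IndexWriter(dir, false,
                            new StandardAnalyzer(), true, new KeepAllCommitsPolicy());
                    // ... add documents ...
                    writer.close();
                }
            }
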
        Michael McCandless made changes -
        Attachment LUCENE-710.patch [ 12352438 ]
        Michael McCandless added a comment -

        Rebased the patch to the current trunk. I plan to commit this, probably by the end of this week.

        Michael McCandless made changes -
        Attachment LUCENE-710.take2.patch [ 12352818 ]
        Doron Cohen added a comment -

        Mike, patching take2 on current trunk fails for IndexFileDeleter.java.
        patching file src/java/org/apache/lucene/index/IndexFileDeleter.java
        Hunk #1 FAILED at 18.
        Also some noise in SegmentInfo.java
        patching file src/java/org/apache/lucene/index/SegmentInfo.java
        Hunk #7 succeeded at 291 (offset 3 lines).

        Michael McCandless added a comment -

        Woops, looks like the commit for LUCENE-825 messed up the patch. OK I updated and re-diff'd and attached take3.

        It's too bad we don't have a patch format that's better integrated with svn, such that even if you have a more recent svn revision checked out, applying the patch would apply it back against the revision it was based on, and then svn would merge in the changes committed to the trunk since then. In this case an svn update on the checkout with the diffs produced no conflicts, so if we had such a combined patch tool, it would have worked fine here. I suppose the person applying the patch could first "svn update" to its base revision, apply the patch, then svn up, but that's kind of a hassle.

        Michael McCandless made changes -
        Attachment LUCENE-710.take3.patch [ 12352965 ]
        Michael McCandless made changes -
        Resolution Fixed [ 1 ]
        Status In Progress [ 3 ] Resolved [ 5 ]
        Doron Cohen added a comment -

        I was too slow in reviewing this, so while I was studying the new code it was committed...

        Anyhow I have a few comments and a question - I think JIRA LUCENE-710 is still the place for this discussion even though the issue is already resolved.

        The attached 710.comments.diff implements a few suggested changes.

        I like the definition and use of IndexDeletionPolicy and IndexCommitPoint - this is very flexible and clear, and would indeed allow implementing NFS-suited logic. These two concepts are central to implementing such logic, and I thought their Javadocs should be enhanced (included in the attached).

        IndexFileDeleter - it is nice that this became non-public and somewhat simpler. I added some internal documentation (not javadocs) in that file as I learned how it works; I think these comments would be useful for others diving into this code. I also modified some variable names for clarity (in the attached).

        I don't understand yet why we allow a deletion policy to delete all commits (including the most recent) - TestDeletionPolicy explains this as: "This is useful for adding to a big index w/ autoCommit=false when you know readers are not using it." - so, would I risk losing the big index should uncommitted additions fail? What does one gain by this? I first thought we should prevent (with an exception) deleting the most recent commit, but I must be missing something - could you elaborate on this?

        checkpoint() is another - more internal - new concept in this code. At the time of writing I don't fully understand it. IndexWriter has its own checkpoint() method, but it also calls IndexFileDeleter.checkpoint(). IndexReader only calls IndexFileDeleter.checkpoint() - it does not have a checkpoint() itself. ...mmm... For IndexReader it makes sense since it always commits only at close(), or at explicit calls to commit(). Perhaps I understand it better now... OK, I added some documentation for this in IndexWriter; I think it would also help others (in the attached).

        This issue also introduced constants for file names - hasSingleNorms (i.e. nrm) and SINGLE_NORMS_EXTENSION (.fN) were confusing/colliding - so I modified .fN to PLAIN_NORMS_EXTENSION.

        This issue moved some file-related logic into SegmentInfo. The -1/1/0 logic, especially with norms, is confusing, and I have to re-read the code carefully each time to be convinced that it is correct. It would be nice when we can get rid of some of the backward-compatibility cases here. Anyhow, I added some documentation and also replaced the -1/1/0 values with constants; I think this makes it easier to understand.

        Regards,
        Doron

        Doron Cohen made changes -
        Attachment 710.review.diff [ 12353363 ]
        Michael McCandless added a comment -

        Thanks for the review Doron!

        Your added comments & improvements to the variable names are
        excellent. I especially like the new constants (YES, NO, CHECK_DIR,
        etc.) in SegmentInfo. I've tweaked your patch here and there, and
        attached a modified patch (710.review.take2.diff). If you're happy
        with that then go ahead and commit it?

        > IndexFileDeleter - it is nice that this became non public and somewhat
        > simpler.

        I especially like that this class now has very little knowledge of
        what files "belong" to an index, especially compared to before. That
        knowledge has now been consolidated under SegmentInfo which I think is
        the right place for it.

        > I don't understand yet why we allow a deletion policy to delete
        > all commits (including the most recent) - TestDeletionPolicy
        > explains this as: "This is useful for adding to a big index w/
        > autoCommit =false when you know readers are not using it." - so,
        > would I risk losing the big index should uncommited additions fail?
        > what does one earn by this? I first thought we should prevent
        > (exception) deleting the most recent commit, but I must be missing
        > something - could you elaborate on this?

        The use case I was thinking of is: say you already have a large index
        and then you need to add a bunch more docs to it. If you are not
        allowed to delete the starting commit, then, you will consume
        substantially more disk space as you are building your index because
        the large segments at the start can't be removed. This would have
        made the "autoCommit false" case unnecessarily worse than the
        "autoCommit true" case. If for a given application the developer is
        concerned about safety (losing index due to crash), then the normal
        default policy should be used.
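
        For illustration only, a policy with exactly that behaviour might look
        roughly like the sketch below. The DeletionPolicy and CommitPoint types
        here are hypothetical stand-ins for the abstractions discussed in this
        issue, not the committed Lucene API, and the class itself is only safe
        in the rebuild-from-scratch scenario described above.

        import java.util.Iterator;
        import java.util.List;

        // Hypothetical stand-ins for the "policy" and "commit point"
        // abstractions discussed in this issue; the committed class names and
        // signatures may differ.
        interface CommitPoint {
          void delete();
        }

        interface DeletionPolicy {
          void onInit(List commits);    // called when a writer first opens an existing index
          void onCommit(List commits);  // called after each new segments_N is written
        }

        // Deletes every commit point it is handed, including the most recent
        // one.  This mirrors the TestDeletionPolicy use case quoted above
        // (rebuilding a large index with autoCommit=false while no readers
        // are open); it is unsafe in any other scenario.
        class DeleteAllCommitsPolicy implements DeletionPolicy {
          public void onInit(List commits)   { deleteAll(commits); }
          public void onCommit(List commits) { deleteAll(commits); }

          private void deleteAll(List commits) {
            for (Iterator it = commits.iterator(); it.hasNext();) {
              ((CommitPoint) it.next()).delete();
            }
          }
        }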

        > This issue moved some file logic to SegmentInfo. The -1/1/0 logic and
        > especially with norms is confusing, and at least I have to re-read
        > the code carefully each time again and again to be convinced that it
        > is correct. It would be nice when we can get rid of some of the
        > backward compatibility cases here. Anyhow I added some documentation
        > and also replaced the -1/1/0 with constants, I think this makes it
        > easier to understand.

        Yes, the backwards-compatibility code (for pre-2.1 indices) is complex.
        The good news is that by the time we release this in 2.2, most indices
        being upgraded to 2.2 will already be in the 2.1 format.
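
        To make the -1/1/0 versus named-constants point concrete, here is a
        rough sketch of how such a per-segment generation field reads once the
        magic values get names. This is illustrative only; the actual
        SegmentInfo field and method names, and the pre-2.1 handling, may
        differ.

        // Illustrative sketch, not the actual SegmentInfo source.
        class GenerationState {
          static final int NO = -1;        // no separate deletions/norms file exists
          static final int YES = 1;        // deletions/norms exist, with an explicit generation
          static final int CHECK_DIR = 0;  // pre-2.1 segment: must probe the directory

          long delGen = NO;

          boolean hasDeletions(boolean delFileExistsInDir) {
            if (delGen == NO) {
              return false;                // never had deletions
            } else if (delGen == CHECK_DIR) {
              return delFileExistsInDir;   // old-format segment: fall back to a directory check
            } else {
              return true;                 // delGen >= YES
            }
          }
        }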

        Michael McCandless made changes -
        Attachment 710.review.take2.diff [ 12353369 ]
        Doron Cohen added a comment -

        > If for a given application the developer is concerned
        > about safety (losing index due to crash), then the
        > normal default policy should be used.

        Spooky... what if onCommit() also deletes all commits?
        (Might this be a pit for users to fall into...?)

        I added warnings about this in IndexDeletionPolicy methods.

        Just committed review.take2 + these comments.
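
        As a contrast to the delete-everything sketch above, a policy that
        always spares the newest commit (the behaviour the new warnings steer
        users toward) might look roughly like this, again reusing the
        hypothetical DeletionPolicy and CommitPoint interfaces from that
        earlier sketch and assuming the commit list is ordered oldest to
        newest (an assumption to verify against the actual API):

        import java.util.List;

        // Keeps only the most recent commit; every older commit point is
        // removed.  Because the newest commit is always spared, onCommit()
        // can never leave the index without a usable segments_N file.
        class KeepNewestCommitPolicy implements DeletionPolicy {
          public void onInit(List commits)   { deleteAllButNewest(commits); }
          public void onCommit(List commits) { deleteAllButNewest(commits); }

          private void deleteAllButNewest(List commits) {
            // Assumes commits are ordered oldest first, so the last entry is
            // the one to keep.
            for (int i = 0; i < commits.size() - 1; i++) {
              ((CommitPoint) commits.get(i)).delete();
            }
          }
        }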

        Michael McCandless added a comment -

        > I added warnings about this in IndexDeletionPolicy methods.

        I think that's good. Thanks!

        Mark Thomas made changes -
        Workflow: jira [ 12389406 ] → Default workflow, editable Closed status [ 12561835 ]
        Mark Thomas made changes -
        Workflow: Default workflow, editable Closed status [ 12561835 ] → jira [ 12582905 ]

          People

          • Assignee:
            Michael McCandless
          • Reporter:
            Michael McCandless
          • Votes:
            0
          • Watchers:
            2
