
Key: LUCENE-673
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Michael McCandless
Reporter: Michael McCandless
Votes: 0
Watchers: 1
Lucene - Java

Exceptions when using Lucene over NFS

Created: 16/Sep/06 02:56 PM   Updated: 11/Dec/07 09:07 PM
Component/s: Index
Affects Version/s: 2.0.0
Fix Version/s: 2.2

Time Tracking:
Not Specified

Environment: NFS server/client

Resolution Date: 11/Dec/07 09:07 PM


Description
I'm opening this issue to track details on the known problems with
Lucene over NFS.

The summary is: if you have one machine writing to an index stored on
an NFS mount, and other machine(s) reading (and periodically
re-opening the index) then sometimes on re-opening the index the
reader will hit a FileNotFound exception.

This has hit many users because this is a natural way to "scale up"
your searching (single writer, multiple readers) across machines. The
best current workaround (I think?) is the approach Solr takes (either
by actually using Solr or by copying/modifying its approach): take
snapshots of the index and have the readers open the snapshots
instead of the "live" index being written to.

I've been working on two patches for Lucene:

  • A locking (LockFactory) implementation using native OS locks
  • Lock-less commits

(I'll open separate issues with the details for those).

I have a simple stress test where one machine is constantly adding
docs to an index over NFS, and another machine is constantly
re-opening the index searcher over NFS.

These tests have revealed new details (at least for me!) about the
root cause of our NFS problems:

  • Even when using native locks over NFS, Lucene still hits these
    exceptions!

I was surprised by this because I had always thought (assumed?)
the NFS problem was because the "simple" file-based locking was
not correct over NFS, and that switching to native OS filesystem
locking would resolve it, but it doesn't.

I can reproduce the "FileNotFound" exceptions even when using NFS
V4 (the latest NFS protocol), so this is not just a "your NFS
server is too old" issue.

  • Then, when running the same stress test with the lock-less
    changes, I don't hit any exceptions. I've tested on NFS version
    2, 3 and 4 (using the "nfsvers=N" mount option).

I think this means (as I believe Hoss suggested at one point) that
the NFS problems are likely due to cache coherence in the NFS file
system: the client's cached view of a file (I think the "segments"
file in particular) can be out of step with which segment data files
actually exist.

In other words, even if you lock correctly, the reader side will
sometimes see stale contents of the "segments" file, which lead it to
try to open a now-deleted segment data file.

So I think this is good news / bad news: the bad news is, native
locking doesn't fix our problems with NFS (as at least I had expected
it to). But the good news is, it looks like (still need to do more
thorough testing of this) the changes for lock-less commits do enable
Lucene to work fine over NFS.

[One quick side note in case it helps others: to get native locks
working over NFS on Ubuntu/Debian Linux 6.06, I had to "apt-get
install nfs-common" on the NFS client machines. Before I did this I
would hit "No locks available" IOExceptions on calling the "tryLock"
method. The default nfs server install on the server machine just
worked because it runs in kernel mode and it starts a lockd process.]



Comments
Marvin Humphrey added a comment - 20/Oct/06 11:51 PM
It seems that NFS doesn't support delete-upon-last-close semantics. That means that an IndexWriter can delete files out from underneath a cached IndexReader, and they're really gone, no? Stale NFS Filehandle exception.

Michael McCandless added a comment - 21/Oct/06 01:18 PM
Yes, you are absolutely correct.

The current implementation of Lucene's "point in time" searching
capability (ie, once an IndexSearcher is open, it searches the
"snapshot" of the index at that point in time, even as writer(s) are
changing the index), directly relies on specific filesystem semantics
of "deletes of still open files".

But, these semantics differ drastically across filesystems:

  • On WIN32 local filesystems you get "Access Denied" when trying to
    delete open files. Lucene catches this & retries.
  • On UNIX local filesystems, the delete succeeds but the underlying
    file is still present & usable by open file handles ("delete on
    last close") until they are closed.
  • But, on NFS, there is absolutely no support for this. NFS server
    (until version 4) is stateless and so makes no effort to let you
    continue to access deleted files.

This means that, at best for NFS (with the "lock-less commits" fixes,
still in progress), we can hope to reliably instantiate a reader (ie, no
more intermittent exceptions on loading the segments), but you will not
be able to use "point in time searching". Meaning, when running a
search, you must expect to get a "stale NFS handle" IOException, and
re-open your index when that happens.
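
The "expect the exception and re-open" pattern described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ReopeningSearcher`, `Opener` and `Query` are hypothetical names that do not come from Lucene, and the sketch retries on any IOException because there is no dedicated exception type for a stale NFS handle.

```java
import java.io.IOException;

/**
 * Hypothetical wrapper (not a Lucene class): runs a search and, if it
 * fails with an IOException such as "Stale NFS file handle", re-opens
 * the index view and tries again.
 */
final class ReopeningSearcher<S, R> {

    /** Opens a fresh view of the index, e.g. a new searcher. */
    interface Opener<T> {
        T open() throws IOException;
    }

    /** Runs one search against an already-open view. */
    interface Query<T, U> {
        U run(T searcher) throws IOException;
    }

    private final Opener<S> opener;
    private final int maxAttempts;
    private S current;

    ReopeningSearcher(Opener<S> opener, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.opener = opener;
        this.maxAttempts = maxAttempts;
        this.current = opener.open();
    }

    synchronized R search(Query<S, R> query) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return query.run(current);
            } catch (IOException e) {
                // The files behind the current view may have been deleted
                // by the writer. A real application would also close the
                // stale view here before replacing it.
                last = e;
                current = opener.open();
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }
}
```

Note that a search which succeeds after a re-open runs against a newer state of the index, so results from before and after the failure are not from the same point in time.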

I think, in the future, it would make sense to change how Lucene
implements "point in time searching" so that it doesn't rely on
filesystem semantics at all (which are clearly quite different in this
area) and, instead, explicitly keeps segments_N files (and the
segments they reference) in the filesystem until "it's decided" (via
some policy, eg, "keep the last N generations" or "keep the past N
days' worth") that they should be pruned.
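
As a rough illustration of a "keep the last N generations" policy, the sketch below picks which segments_N files would be pruned from a directory listing. It is not Lucene code: the class and method names are hypothetical, and it assumes the generation suffix is written in base 36 (Character.MAX_RADIX), which should be checked against the actual file naming. It also ignores the segment data files each commit references, which a real policy would have to prune along with it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical helper (not part of Lucene): "keep the last N generations". */
final class KeepLastGenerations {

    private static final String PREFIX = "segments_";

    /**
     * Returns the segments_N file names that fall outside the newest
     * {@code keep} generations, oldest first.
     */
    static List<String> commitsToPrune(List<String> fileNames, int keep) {
        if (keep < 0) {
            throw new IllegalArgumentException("keep must be >= 0");
        }
        // Sort commit files by generation number.
        TreeMap<Long, String> byGeneration = new TreeMap<>();
        for (String name : fileNames) {
            if (!name.startsWith(PREFIX) || name.length() == PREFIX.length()) {
                continue; // not a segments_N file
            }
            try {
                // Assumption: the suffix is the generation in base 36.
                long gen = Long.parseLong(
                        name.substring(PREFIX.length()), Character.MAX_RADIX);
                byGeneration.put(gen, name);
            } catch (NumberFormatException ignored) {
                // Suffix is not a generation number; leave the file alone.
            }
        }
        List<String> ordered = new ArrayList<>(byGeneration.values());
        int pruneCount = Math.max(0, ordered.size() - keep);
        return new ArrayList<>(ordered.subList(0, pruneCount));
    }
}
```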

Note that such an explicit implementation would also resolve a
limitation of the current "point in time searching" which is: you
can't close your searcher and re-open it at that same point in time.
If your searcher crashes, or JVM crashes, or whatever, you are forced
at that point to switch up to the current index. You don't have the
freedom to re-open the snapshot you had been using. An explicit
implementation would fix that.

The "lock-less commits" changes would make this quite straightforward
as a future change, but I'm not aiming to do that for starters –
"progress not perfection"!


Steven Parkes added a comment - 23/Oct/06 05:57 PM
This is more of an aside than anything else, but V2-3 clients do have some support for delete after close, right? The whole .nfsXXXX thing? Server doesn't really need any support, though I think some versions "include" cron cleanup of old .nfsXXXX files that never got deleted.

Yonik Seeley added a comment - 23/Oct/06 07:20 PM
> but V2-3 clients do have some support for delete after close, right? The whole .nfsXXXX thing?

I don't think that works across boxes though.
If host "a" opens a file, and host "b" deletes that file, host "a" won't end up with the .nfs file but will end up with a "Stale NFS file handle" instead.


Steven Parkes added a comment - 23/Oct/06 07:26 PM
Yeah, I think you're right. I figured I was missing something.

Michael McCandless made changes - 28/Nov/06 10:29 PM
  Assignee: (none) -> Michael McCandless [ mikemccand ]

Michael McCandless added a comment - 22/Jun/07 08:40 PM
This issue is now resolved by both LUCENE-701 and LUCENE-710 being fixed.
As far as I know there are no other outstanding issues preventing Lucene from
working over NFS. Here's an excerpt from an email I just sent to java-user:

As far as I know, Lucene should now work over NFS, except you will
have to make a custom deletion policy that works for your application.

Lucene had issues with NFS in three areas: locking, stale client-side
file caches and how NFS handles deletion of open files. The first two
were fixed in Lucene 2.1 with lock-less commits (LUCENE-701) and the
last one is fixed in 2.2 with the addition of "custom deletion
policies" (LUCENE-710).

For a custom deletion policy you need to implement the
org.apache.lucene.index.IndexDeletionPolicy interface in your own
class and pass an instance of that class to your IndexWriter. This
class tells IndexWriter when it's safe to delete older commits. By
default Lucene uses an instance of KeepOnlyLastCommitDeletionPolicy.

The basic idea is to implement logic that can tell when your readers
are done using an older commit in the index. For example if you know
your readers refresh themselves once per hour then your deletion
policy can safely delete any commit older than 1 hour.
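
The decision logic of such a time-based policy might look like the sketch below. It deliberately does not implement org.apache.lucene.index.IndexDeletionPolicy, to avoid guessing at its method signatures: `ExpireOldCommits` and `Commit` are hypothetical stand-ins, and the commit timestamps are assumed to be tracked by the application. A real policy would apply the same decision from inside its own implementation of that interface.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a time-based deletion decision (not Lucene API). */
final class ExpireOldCommits {

    /** Stand-in for a commit point: its segments file and commit time. */
    static final class Commit {
        final String segmentsFileName;
        final Instant committedAt;

        Commit(String segmentsFileName, Instant committedAt) {
            this.segmentsFileName = segmentsFileName;
            this.committedAt = committedAt;
        }
    }

    /**
     * Returns the commits that are safe to delete, assuming every reader
     * refreshes at least once per {@code maxAge}.
     *
     * @param commits all commits in the index, oldest first
     */
    static List<Commit> deletable(List<Commit> commits, Instant now, Duration maxAge) {
        List<Commit> result = new ArrayList<>();
        // The newest commit (last element) is always kept.
        for (int i = 0; i < commits.size() - 1; i++) {
            // A reader can keep using commit i until its next refresh after
            // commit i+1 appeared, so age is measured from that moment.
            Instant supersededAt = commits.get(i + 1).committedAt;
            if (Duration.between(supersededAt, now).compareTo(maxAge) > 0) {
                result.add(commits.get(i));
            }
        }
        return result;
    }
}
```

One design choice worth calling out: the sketch measures a commit's age from the time it was superseded rather than from the time it was written. If the index sits idle for several hours, a reader may have opened the old commit only minutes before the next commit arrives, so expiring on creation time alone could delete files that a reader is still using.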

But please note that while I believe NFS should work fine, this has
not been heavily tested yet. Also note that performance over NFS is
generally not great. If you do go down this route please report back
on any success or failure! Thanks.


Michael McCandless made changes - 22/Jun/07 08:40 PM
  Fix Version/s: (none) -> 2.2 [ 12312328 ]
  Resolution: (none) -> Fixed [ 1 ]
  Status: Open [ 1 ] -> Closed [ 6 ]

Michael McCandless added a comment - 04/Jul/07 08:12 PM
This is not quite resolved yet. In the case where you have multiple
machines that can be writers, and the writer is able to quickly jump
back and forth between them, there is at least one issue (LUCENE-948)
that prevents this from working.

Michael McCandless made changes - 04/Jul/07 08:12 PM
  Status: Closed [ 6 ] -> Reopened [ 4 ]
  Resolution: Fixed [ 1 ] -> (none)

Michael McCandless added a comment - 30/Sep/07 11:29 AM
More updates on the status of Lucene over NFS (see details in
LUCENE-1011):
  • For the multi-writer (ie, writers on different machines) case,
    sharing an index over NFS, Lucene currently can corrupt the index.
    But the pending fix in LUCENE-1011 looks to resolve this.
  • Also in LUCENE-1011 is a set of tools to test whether locking is
    working correctly in your environment. If you are having problems
    over NFS or some other "interesting" filesystem, it's best to
    first run the LockStressTest tool to see if it's a locking
    problem.
  • SimpleFSLockFactory seems to work in cases where
    NativeFSLockFactory does not. So, from now on,
    SimpleFSLockFactory should be the first lock factory you try to
    use on NFS!
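
For readers unfamiliar with the "simple" file-based style of locking, the general idea is that a lock is held by whoever manages to create a lock file, and released by deleting it. The sketch below illustrates that idea only. It is not SimpleFSLockFactory's actual code, the class name `SimpleFileLock` is hypothetical, and whether exclusive file creation is truly atomic on a given NFS setup is exactly the kind of thing the LockStressTest tool mentioned above is meant to check.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration of lock-file locking (not Lucene code). */
final class SimpleFileLock {

    private final Path lockFile;

    SimpleFileLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Tries to take the lock; returns false if the lock file already exists. */
    boolean tryObtain() throws IOException {
        try {
            // createFile fails if the file already exists, so at most one
            // caller succeeds, provided the filesystem honours that check.
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException alreadyHeld) {
            return false;
        }
    }

    /**
     * Releases the lock by deleting the lock file. This sketch does not
     * verify that the caller is the holder.
     */
    void release() throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

One known trade-off of this style is that a process which crashes while holding the lock leaves the lock file behind, and it then has to be removed by hand.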

Michael McCandless made changes - 11/Dec/07 09:07 PM
  Resolution: (none) -> Fixed [ 1 ]
  Status: Reopened [ 4 ] -> Resolved [ 5 ]