Lucene - Core
  1. Lucene - Core
  2. LUCENE-1877

Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9
    • Component/s: general/javadocs
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      A user requested we add a note in IndexWriter alerting the availability of NativeFSLockFactory (allowing you to avoid retaining locks on abnormal jvm exit). Seems reasonable to me - we want users to be able to easily stumble upon this class. The below code looks like a good spot to add a note - could also improve whats there a bit - opening an IndexWriter does not necessarily create a lock file - that would depend on the LockFactory used.

        <p>Opening an <code>IndexWriter</code> creates a lock file for the directory in use. Trying to open
        another <code>IndexWriter</code> on the same directory will lead to a
        {@link LockObtainFailedException}. The {@link LockObtainFailedException}
        is also thrown if an IndexReader on the same directory is used to delete documents
        from the index.</p>

      Anyone remember why NativeFSLockFactory is not the default over SimpleFSLockFactory?

      1. LUCENE-1877.patch
        23 kB
        Uwe Schindler
      2. LUCENE-1877.patch
        22 kB
        Uwe Schindler
      3. LUCENE-1877.patch
        18 kB
        Uwe Schindler
      4. LUCENE-1877.patch
        5 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          Uwe Schindler added a comment -

          For IndexWriter/IndexReader this hint is no longer needed (in Lucene 2.9), as all methods taking String/File instead of Directory are deprecated and users should create directory instances and then will automatically get to the place where the LockFactory can be supplied.

          The note should be added to FSDirectory instead.

          Show
          Uwe Schindler added a comment - For IndexWriter/IndexReader this hint is no longer needed (in Lucene 2.9), as all methods taking String/File instead of Directory are deprecated and users should create directory instances and then will automatically get to the place where the LockFactory can be supplied. The note should be added to FSDirectory instead.
          Hide
          Mark Miller added a comment - - edited

          My initial thought was also that it didn't really belong in IndexWriter - but I sold myself on the fact that IndexWriter talks about locking and offers the force unlock method - so it seems fine to me to mention how to use the optimal locking factory (and generally avoid using the force unlock at all - as an aside I just saw a guy trying to use that the other day as regular code so that they could use two IndexWriters with just commit rather than close - ugg).

          I'm not sold either way though - I'd go with whatever. My preference would really be to make it the default (though of course not for 2.9).

          Show
          Mark Miller added a comment - - edited My initial thought was also that it didn't really belong in IndexWriter - but I sold myself on the fact that IndexWriter talks about locking and offers the force unlock method - so it seems fine to me to mention how to use the optimal locking factory (and generally avoid using the force unlock at all - as an aside I just saw a guy trying to use that the other day as regular code so that they could use two IndexWriters with just commit rather than close - ugg). I'm not sold either way though - I'd go with whatever. My preference would really be to make it the default (though of course not for 2.9).
          Hide
          Michael McCandless added a comment -

          Anyone remember why NativeFSLockFactory is not the default over SimpleFSLockFactory?

          In my testing (long ago) over NFS, I actually found "native" locks didn't work as well as "simple" locks. I was also a bit nervous on how well supported "native" locks are across different OSs.

          My preference would really be to make it the default (though of course not for 2.9).

          +1, I think it's the better default.

          People who use Lucene over NFS already have to do special things (eg make a custom deletion policy), and far too many users hit the "leftover lock file" problem. We could state in the javadocs that this default will change in 3.0?

          Maybe just add one sentence in that IndexWriter locking section, referencing the discussion in NativeFSLockFactory's javadocs about not having the "leftover lock file" problem?

          Show
          Michael McCandless added a comment - Anyone remember why NativeFSLockFactory is not the default over SimpleFSLockFactory? In my testing (long ago) over NFS, I actually found "native" locks didn't work as well as "simple" locks. I was also a bit nervous on how well supported "native" locks are across different OSs. My preference would really be to make it the default (though of course not for 2.9). +1, I think it's the better default. People who use Lucene over NFS already have to do special things (eg make a custom deletion policy), and far too many users hit the "leftover lock file" problem. We could state in the javadocs that this default will change in 3.0? Maybe just add one sentence in that IndexWriter locking section, referencing the discussion in NativeFSLockFactory's javadocs about not having the "leftover lock file" problem?
          Hide
          Uwe Schindler added a comment -

          Let's do it in the following way:

          • deprecated FSDir.getDirectory() methods return the SimpleLockFactory, as it was before.
          • The new FSDir.open() methods and also the direct ctors of SimpleFSDir, MMapFSDir, NIOFSDir default to NativeLocakFactory (these ctors/methods are all new in 2.9)

          Because of this we have no BW problem.

          Show
          Uwe Schindler added a comment - Let's do it in the following way: deprecated FSDir.getDirectory() methods return the SimpleLockFactory, as it was before. The new FSDir.open() methods and also the direct ctors of SimpleFSDir, MMapFSDir, NIOFSDir default to NativeLocakFactory (these ctors/methods are all new in 2.9) Because of this we have no BW problem.
          Hide
          Marvin Humphrey added a comment -

          > Anyone remember why NativeFSLockFactory is not the default over
          > SimpleFSLockFactory?

          Wasn't it because native locking is somethings implemented with Fcntl, and
          Fcntl locking blows chunks? Especially for a library rather than an
          application.

          From the BSD manpage on Fcntl:

          This interface follows the completely stupid semantics of System V and IEEE
          Std 1003.1-1988 (``POSIX.1'') that require that all locks associated with a
          file for a given process are removed when any file descriptor for that file is
          closed by that process. This semantic means that applications must be aware
          of any files that a subroutine library may access. For example if an
          application for updating the password file locks the password file database
          while making the update, and then calls getpwname(3) to retrieve a record, the
          lock will be lost because getpwname(3) opens, reads, and closes the password
          database. The database close will release all locks that the process has
          associated with the database, even if the library routine never requested a
          lock on the database. Another minor semantic problem with this interface is
          that locks are not inherited by a child process created using the fork(2)
          function. The flock(2) interface has much more rational last close
          semantics and allows locks to be inherited by child processes. Flock(2) is
          recommended for applications that want to ensure the integrity of their locks
          when using library routines or wish to pass locks to their children...

          The lockfile may be annoying, but at least it's guaranteed safe on all non-shared
          volumes when the OS implements atomic file opening.

          Are you folks at least able to clean up orphaned lockfiles if the PID it was created
          under is no longer active?

          Show
          Marvin Humphrey added a comment - > Anyone remember why NativeFSLockFactory is not the default over > SimpleFSLockFactory? Wasn't it because native locking is somethings implemented with Fcntl, and Fcntl locking blows chunks? Especially for a library rather than an application. From the BSD manpage on Fcntl: This interface follows the completely stupid semantics of System V and IEEE Std 1003.1-1988 (``POSIX.1'') that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process. This semantic means that applications must be aware of any files that a subroutine library may access. For example if an application for updating the password file locks the password file database while making the update, and then calls getpwname(3) to retrieve a record, the lock will be lost because getpwname(3) opens, reads, and closes the password database. The database close will release all locks that the process has associated with the database, even if the library routine never requested a lock on the database. Another minor semantic problem with this interface is that locks are not inherited by a child process created using the fork(2) function. The flock(2) interface has much more rational last close semantics and allows locks to be inherited by child processes. Flock(2) is recommended for applications that want to ensure the integrity of their locks when using library routines or wish to pass locks to their children... The lockfile may be annoying, but at least it's guaranteed safe on all non-shared volumes when the OS implements atomic file opening. Are you folks at least able to clean up orphaned lockfiles if the PID it was created under is no longer active?
          Hide
          Mark Miller added a comment -

          This interface follows the completely stupid semantics of System V and IEEE
          Std 1003.1-1988 (``POSIX.1'') that require that all locks associated with a
          file for a given process are removed when any file descriptor for that file is
          closed by that process. This semantic means that applications must be aware
          of any files that a subroutine library may access. For example if an
          application for updating the password file locks the password file database
          while making the update, and then calls getpwname(3) to retrieve a record, the
          lock will be lost because getpwname(3) opens, reads, and closes the password
          database. The database close will release all locks that the process has
          associated with the database, even if the library routine never requested a
          lock on the database. Another minor semantic problem with this interface is
          that locks are not inherited by a child process created using the fork(2)
          function. The flock(2) interface has much more rational last close
          semantics and allows locks to be inherited by child processes. Flock(2) is
          recommended for applications that want to ensure the integrity of their locks
          when using library routines or wish to pass locks to their children...

          I can see how this is not ideal, but I'm not seeing how any of the mentioned issues apply to our simple lock usage ...

          Show
          Mark Miller added a comment - This interface follows the completely stupid semantics of System V and IEEE Std 1003.1-1988 (``POSIX.1'') that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process. This semantic means that applications must be aware of any files that a subroutine library may access. For example if an application for updating the password file locks the password file database while making the update, and then calls getpwname(3) to retrieve a record, the lock will be lost because getpwname(3) opens, reads, and closes the password database. The database close will release all locks that the process has associated with the database, even if the library routine never requested a lock on the database. Another minor semantic problem with this interface is that locks are not inherited by a child process created using the fork(2) function. The flock(2) interface has much more rational last close semantics and allows locks to be inherited by child processes. Flock(2) is recommended for applications that want to ensure the integrity of their locks when using library routines or wish to pass locks to their children... I can see how this is not ideal, but I'm not seeing how any of the mentioned issues apply to our simple lock usage ...
          Hide
          Mark Miller added a comment -

          People who use Lucene over NFS already have to do special things (eg make a custom deletion policy), and far too many users hit the "leftover lock file" problem. We could state in the javadocs that this default will change in 3.0?

          +1 from me - if it made things work out of the box with NFS, I'd vote to keep as is, but the points you mention were in my head too.

          My only worry is current users counting on this default for NFS - but if we put it in the back compat break section (a break in regards to NFS anyway), that should be sufficient warning?

          Show
          Mark Miller added a comment - People who use Lucene over NFS already have to do special things (eg make a custom deletion policy), and far too many users hit the "leftover lock file" problem. We could state in the javadocs that this default will change in 3.0? +1 from me - if it made things work out of the box with NFS, I'd vote to keep as is, but the points you mention were in my head too. My only worry is current users counting on this default for NFS - but if we put it in the back compat break section (a break in regards to NFS anyway), that should be sufficient warning?
          Hide
          Marvin Humphrey added a comment -

          > I can see how this is not ideal, but I'm not seeing how any of the
          > mentioned issues apply to our simple lock usage ...

          "Simple lock usage"?! You must have a bigger brain than me...

          As a matter of fact, I think you're right. Fcntl locks have two major
          drawbacks, and upon review I think NativeFSLockFactory avoids both of them.

          The first is that opening and closing a file releases all locks for the entire
          process. Even if you never request a lock on the second filehandle, or if you
          request a lock and the request is denied, closing the filehandle releases the
          lock on the first filehandle. But NativeFSLockFactory avoids that problem by
          keeping the HashSet of filepaths and ensuring that the same file is never
          opened twice. Furthermore, since the lockfiles are private to Lucene, you can
          assume that nobody else is going to open them and inadvertently spoil the lock.

          The second is that child processes spawned via fork() do not inherit locks
          from the parent process. If you assume that nobody's ever going to fork a
          Java process, that's not relevant. (Too bad that won't work for Lucy... we
          have to support fork().)

          I think you're probably safe with Fcntl locks on all non-shared volumes.

          Show
          Marvin Humphrey added a comment - > I can see how this is not ideal, but I'm not seeing how any of the > mentioned issues apply to our simple lock usage ... "Simple lock usage"?! You must have a bigger brain than me... As a matter of fact, I think you're right. Fcntl locks have two major drawbacks, and upon review I think NativeFSLockFactory avoids both of them. The first is that opening and closing a file releases all locks for the entire process. Even if you never request a lock on the second filehandle, or if you request a lock and the request is denied, closing the filehandle releases the lock on the first filehandle. But NativeFSLockFactory avoids that problem by keeping the HashSet of filepaths and ensuring that the same file is never opened twice. Furthermore, since the lockfiles are private to Lucene, you can assume that nobody else is going to open them and inadvertently spoil the lock. The second is that child processes spawned via fork() do not inherit locks from the parent process. If you assume that nobody's ever going to fork a Java process, that's not relevant. (Too bad that won't work for Lucy... we have to support fork().) I think you're probably safe with Fcntl locks on all non-shared volumes.
          Hide
          Mark Miller added a comment -

          My brain has never been as large as I'd like it to be, but that's
          never concerned me too greatly - it's my ego that I have the trouble
          with.

          • Mark

          http://www.lucidimagination.com (mobile)

          On Aug 30, 2009, at 11:25 PM, "Marvin Humphrey (JIRA)"

          Show
          Mark Miller added a comment - My brain has never been as large as I'd like it to be, but that's never concerned me too greatly - it's my ego that I have the trouble with. Mark http://www.lucidimagination.com (mobile) On Aug 30, 2009, at 11:25 PM, "Marvin Humphrey (JIRA)"
          Hide
          Michael McCandless added a comment -

          Let's do it in the following way:

          • deprecated FSDir.getDirectory() methods return the SimpleLockFactory, as it was before.
          • The new FSDir.open() methods and also the direct ctors of SimpleFSDir, MMapFSDir, NIOFSDir default to NativeLocakFactory (these ctors/methods are all new in 2.9)

          +1

          Show
          Michael McCandless added a comment - Let's do it in the following way: deprecated FSDir.getDirectory() methods return the SimpleLockFactory, as it was before. The new FSDir.open() methods and also the direct ctors of SimpleFSDir, MMapFSDir, NIOFSDir default to NativeLocakFactory (these ctors/methods are all new in 2.9) +1
          Hide
          Uwe Schindler added a comment -

          Here is a patch, that changes the default for ctor-based / open() based instantiations to use NativeFSLockFactory (in fact, if the supplied param to ctor is NULL). Also change javadocs.

          It also deprecates the static FSDirectory.setDisableLocks() [which we should have done already]. One should simple use a ctor/open with NoLockFactory as param to do that.

          Currently only TestIndexReader fails here on windows because of some strange lockfile-delete opeartions. Maybe the testcase must be updated. I will look into this.

          If we want to go this way, we have to put this in 2.9.

          Show
          Uwe Schindler added a comment - Here is a patch, that changes the default for ctor-based / open() based instantiations to use NativeFSLockFactory (in fact, if the supplied param to ctor is NULL). Also change javadocs. It also deprecates the static FSDirectory.setDisableLocks() [which we should have done already] . One should simple use a ctor/open with NoLockFactory as param to do that. Currently only TestIndexReader fails here on windows because of some strange lockfile-delete opeartions. Maybe the testcase must be updated. I will look into this. If we want to go this way, we have to put this in 2.9.
          Hide
          Uwe Schindler added a comment -

          I will be happy when 3.0 removes all this FSDirectory deprecated stuff. Its a hell to maintain!

          Show
          Uwe Schindler added a comment - I will be happy when 3.0 removes all this FSDirectory deprecated stuff. Its a hell to maintain!
          Hide
          Michael McCandless added a comment -

          I think we should also fix NativeLockFactory so that if the write lock is in the index dir it doesn't generate the large digest in the file name. That digest is problematic when two different machines access the same physical dir via different mount names, since that results in different lock file names.

          Show
          Michael McCandless added a comment - I think we should also fix NativeLockFactory so that if the write lock is in the index dir it doesn't generate the large digest in the file name. That digest is problematic when two different machines access the same physical dir via different mount names, since that results in different lock file names.
          Hide
          Uwe Schindler added a comment - - edited

          With the above patch some more tests are failing, mostly because of the strange lock file names. I think we should fix the tests, at least hardcode the simplelock factory, when it should be tested.

          The backwards-tests seem to pass, as they only use FSDir.getDirectory() which defaults to old standard. That's good.

          Show
          Uwe Schindler added a comment - - edited With the above patch some more tests are failing, mostly because of the strange lock file names. I think we should fix the tests, at least hardcode the simplelock factory, when it should be tested. The backwards-tests seem to pass, as they only use FSDir.getDirectory() which defaults to old standard. That's good.
          Hide
          Mark Miller added a comment -

          If we want to go this way, we have to put this in 2.9.

          I'd personally be a little (to a lot) afraid to change the default to native during freeze -

          Show
          Mark Miller added a comment - If we want to go this way, we have to put this in 2.9. I'd personally be a little (to a lot) afraid to change the default to native during freeze -
          Hide
          Uwe Schindler added a comment -

          It's only the default for new code, clearly documented; deprecated code stays as it is.

          If we will not get this into 2.9, 3.0 will remove the deprecated parts and the new code (new in 2.9) will change its defaults.

          Show
          Uwe Schindler added a comment - It's only the default for new code, clearly documented; deprecated code stays as it is. If we will not get this into 2.9, 3.0 will remove the deprecated parts and the new code (new in 2.9) will change its defaults.
          Hide
          Mark Miller added a comment -

          Oh, okay, cool - that makes me feel a little better.

          Though new users seeing it as the default now - thats not the worst situation, but I would almost prefer the change go through a dev cycle as the default.

          If others are not feeling as cautious, I wouldn't vote against.

          Show
          Mark Miller added a comment - Oh, okay, cool - that makes me feel a little better. Though new users seeing it as the default now - thats not the worst situation, but I would almost prefer the change go through a dev cycle as the default. If others are not feeling as cautious, I wouldn't vote against.
          Hide
          Uwe Schindler added a comment - - edited

          As nobody else objects, I will update the tests tomorrow and switch to NativeFSLockFactory for the new ctors and FSDir.open(). The old and deprectated API is unchanged.

          I will also remove the unneeded lock prefix and use the same lock file name as SimpleFSLockFactory. This would also help users mixing both lock factories together (by using deprecated code defaulting to Simple and new code defaulting to Native). The SimpleLockFactory would also detect a lock, if the NativeFSLockFactory created it (because file has same name). The tests will then also pass (which depended on file name).

          Will go to bed now.

          Show
          Uwe Schindler added a comment - - edited As nobody else objects, I will update the tests tomorrow and switch to NativeFSLockFactory for the new ctors and FSDir.open(). The old and deprectated API is unchanged. I will also remove the unneeded lock prefix and use the same lock file name as SimpleFSLockFactory. This would also help users mixing both lock factories together (by using deprecated code defaulting to Simple and new code defaulting to Native). The SimpleLockFactory would also detect a lock, if the NativeFSLockFactory created it (because file has same name). The tests will then also pass (which depended on file name). Will go to bed now.
          Hide
          Uwe Schindler added a comment - - edited

          I think we should also fix NativeLockFactory so that if the write lock is in the index dir it doesn't generate the large digest in the file name. That digest is problematic when two different machines access the same physical dir via different mount names, since that results in different lock file names.

          The digest problem is not easy to solve: It happens for all LockFactories if they are not automatically created (when LockFactory==null). As soon as you call FSDir.open(..., new SimpleLockFactory(...)) you also get this prefix. It does not appear, when the FSDir is created by FSDir.getDirectory(), as the init() method cleans the lockPrefix directly after setting the lockfactory (the lock factory setter sets the prefix).

          The prefix is only important, if the lock is not placed inside the index directory. The best would be that FSDir would simply return null in getLockId(), when the LockFactory uses the same path as the Directory. For that to work, the LockFactory should have a getter for the fs path.

          I will try some possibilities and post a patch.

          Show
          Uwe Schindler added a comment - - edited I think we should also fix NativeLockFactory so that if the write lock is in the index dir it doesn't generate the large digest in the file name. That digest is problematic when two different machines access the same physical dir via different mount names, since that results in different lock file names. The digest problem is not easy to solve: It happens for all LockFactories if they are not automatically created (when LockFactory==null). As soon as you call FSDir.open(..., new SimpleLockFactory(...)) you also get this prefix. It does not appear, when the FSDir is created by FSDir.getDirectory(), as the init() method cleans the lockPrefix directly after setting the lockfactory (the lock factory setter sets the prefix). The prefix is only important, if the lock is not placed inside the index directory. The best would be that FSDir would simply return null in getLockId(), when the LockFactory uses the same path as the Directory. For that to work, the LockFactory should have a getter for the fs path. I will try some possibilities and post a patch.
          Hide
          Uwe Schindler added a comment - - edited

          Here is the patch:

          • SimpleFSLockFactory and NativeFSLockFactory now have the same abstract superclass providing setLockDir and getLockDir. Using this method, it is possible for directory instances to detect, if the locks reside in the directory itsself and so a lock prefix is switched off.
          • The isLocked() bug in NativeFSLockFactory (LUCENE-1885) is solved by implementing what was described in this issue.
          • aquireTestLock in NativeFSLockFactory was removed from ctor and only called for the first makeLock() call. This prevents the LockFactory from creating the directory when not needed (e.g. opening non-existent index).

          I have one idea (which is a new feature): How about providing a ctor to NativeFSLockFactory and SimpleFSLockFactory without param. When this LF is added to a FSDir, it would default to set the LockDir to itsself (if lf.getLockDir()==null) lf.setLockDir(this.directory)). This would prevent users from always giving the directory twice? Any thoughts, I would like to have that.

          Because of the missing lockPrefix for locks inside the directory itsself one backwards test (TestLockFactory) must be changed in backwards-branch, too.

          Show
          Uwe Schindler added a comment - - edited Here is the patch: SimpleFSLockFactory and NativeFSLockFactory now have the same abstract superclass providing setLockDir and getLockDir. Using this method, it is possible for directory instances to detect, if the locks reside in the directory itsself and so a lock prefix is switched off. The isLocked() bug in NativeFSLockFactory ( LUCENE-1885 ) is solved by implementing what was described in this issue. aquireTestLock in NativeFSLockFactory was removed from ctor and only called for the first makeLock() call. This prevents the LockFactory from creating the directory when not needed (e.g. opening non-existent index). I have one idea (which is a new feature): How about providing a ctor to NativeFSLockFactory and SimpleFSLockFactory without param. When this LF is added to a FSDir, it would default to set the LockDir to itsself (if lf.getLockDir()==null) lf.setLockDir(this.directory)). This would prevent users from always giving the directory twice? Any thoughts, I would like to have that. Because of the missing lockPrefix for locks inside the directory itsself one backwards test (TestLockFactory) must be changed in backwards-branch, too.
          Hide
          Mark Miller added a comment -

          Any thoughts, I would like to have that.

          +1 - def a good idea.

          I'd kind of like to deprecate the sys property to set the lock dir as well - we have done a good job of moving away from that stuff elsewhere.

          Show
          Mark Miller added a comment - Any thoughts, I would like to have that. +1 - def a good idea. I'd kind of like to deprecate the sys property to set the lock dir as well - we have done a good job of moving away from that stuff elsewhere.
          Hide
          Uwe Schindler added a comment -

          I'd kind of like to deprecate the sys property to set the lock dir as well - we have done a good job of moving away from that stuff elsewhere.

          It is already deprecated and even not used anymore:

            /**
             * Directory specified by <code>org.apache.lucene.lockDir</code>
             * or <code>java.io.tmpdir</code> system property.
          
             * @deprecated As of 2.1, <code>LOCK_DIR</code> is unused
             * because the write.lock is now stored by default in the
             * index directory.  If you really want to store locks
             * elsewhere you can create your own {@link
             * SimpleFSLockFactory} (or {@link NativeFSLockFactory},
             * etc.) passing in your preferred lock directory.  Then,
             * pass this <code>LockFactory</code> instance to one of
             * the <code>getDirectory</code> methods that take a
             * <code>lockFactory</code> (for example, {@link #getDirectory(String, LockFactory)}).
             */
            public static final String LOCK_DIR = System.getProperty("org.apache.lucene.lockDir",
                                                                     System.getProperty("java.io.tmpdir"));
          
          Show
          Uwe Schindler added a comment - I'd kind of like to deprecate the sys property to set the lock dir as well - we have done a good job of moving away from that stuff elsewhere. It is already deprecated and even not used anymore: /** * Directory specified by <code>org.apache.lucene.lockDir</code> * or <code>java.io.tmpdir</code> system property. * @deprecated As of 2.1, <code>LOCK_DIR</code> is unused * because the write.lock is now stored by default in the * index directory. If you really want to store locks * elsewhere you can create your own {@link * SimpleFSLockFactory} (or {@link NativeFSLockFactory}, * etc.) passing in your preferred lock directory. Then, * pass this <code>LockFactory</code> instance to one of * the <code>getDirectory</code> methods that take a * <code>lockFactory</code> ( for example, {@link #getDirectory( String , LockFactory)}). */ public static final String LOCK_DIR = System .getProperty( "org.apache.lucene.lockDir" , System .getProperty( "java.io.tmpdir" ));
          Hide
          Mark Miller added a comment - - edited

          heh - my jetlag is full effect - I wasn't looking at the lockdir, I was looking at:

              String lockClassName = System.getProperty("org.apache.lucene.store.FSDirectoryLockFactoryClass");
          

          *edit

          I'm too tired to be emailing - how about deprecating this one though?

          Show
          Mark Miller added a comment - - edited heh - my jetlag is full effect - I wasn't looking at the lockdir, I was looking at: String lockClassName = System .getProperty( "org.apache.lucene.store.FSDirectoryLockFactoryClass" ); *edit I'm too tired to be emailing - how about deprecating this one though?
          Hide
          Uwe Schindler added a comment -

          It is indirectly deprecated, as it is only used, when FSDir.getDirectory() is used. In all other cases NativeFSLockFactory is used or the given one. Maybe we should add a note somewhere in javadocs. The same with the default FSDir class property (its also indirectly deprecated.

          I know the code is very ugly, but this is how it works

          Show
          Uwe Schindler added a comment - It is indirectly deprecated, as it is only used, when FSDir.getDirectory() is used. In all other cases NativeFSLockFactory is used or the given one. Maybe we should add a note somewhere in javadocs. The same with the default FSDir class property (its also indirectly deprecated. I know the code is very ugly, but this is how it works
          Hide
          Mark Miller added a comment -

          Interesting - yeah, its hard to follow it all I havn't had a chance to apply and look at your patch either.

          My main issue with it is that there a bunch of places where it says you can set the lock factory that way (not in deprecated javadoc sections). We should prob remove all those.

          Show
          Mark Miller added a comment - Interesting - yeah, its hard to follow it all I havn't had a chance to apply and look at your patch either. My main issue with it is that there a bunch of places where it says you can set the lock factory that way (not in deprecated javadoc sections). We should prob remove all those.
          Hide
          Michael McCandless added a comment -

          Patch looks good! I like the deprecation of FSDir.set/getDisableLocks and the new FSLockFactory approach.

          Maybe we should add a note somewhere in javadocs. The same with the default FSDir class property (its also indirectly deprecated

          +1, I think we should deprecate these global system properties.

          Show
          Michael McCandless added a comment - Patch looks good! I like the deprecation of FSDir.set/getDisableLocks and the new FSLockFactory approach. Maybe we should add a note somewhere in javadocs. The same with the default FSDir class property (its also indirectly deprecated +1, I think we should deprecate these global system properties.
          Hide
          Uwe Schindler added a comment - - edited

          Final patch.

          I implemented additional:

          • all FS-based lock factories use the same prefix encoding. They are now (mostly) compatible. E.g. a lock obtained with NativeFSLockFactory would also be seen as locked with SimpleFSLockFactory.
          • Added LockFactory ctors with no param. The FSDir will set the lockdir to itsself in this case.
          • Added test for LUCENE-1885
          • Added note about deprecation of all FSDir related system properties, fixed some docs

          Please test this extensively! I hope, I found all problems.

          Show
          Uwe Schindler added a comment - - edited Final patch. I implemented additional: all FS-based lock factories use the same prefix encoding. They are now (mostly) compatible. E.g. a lock obtained with NativeFSLockFactory would also be seen as locked with SimpleFSLockFactory. Added LockFactory ctors with no param. The FSDir will set the lockdir to itsself in this case. Added test for LUCENE-1885 Added note about deprecation of all FSDir related system properties, fixed some docs Please test this extensively! I hope, I found all problems.
          Hide
          Michael McCandless added a comment -

          E.g. a lock obtained with NativeFSLockFactory would also be seen as locked with SimpleFSLockFactory.

          This is neat, but I don't think we should advertise it?

          Ie, it's unsupported to mix different LockFactory impls. EG, in this case, the reverse is not true, right?

          Show
          Michael McCandless added a comment - E.g. a lock obtained with NativeFSLockFactory would also be seen as locked with SimpleFSLockFactory. This is neat, but I don't think we should advertise it? Ie, it's unsupported to mix different LockFactory impls. EG, in this case, the reverse is not true, right?
          Hide
          Uwe Schindler added a comment -

          This is neat, but I don't think we should advertise it?

          Definitely not.

          b.q. Ie, it's unsupported to mix different LockFactory impls. EG, in this case, the reverse is not true, right?

          Exactly.

          Show
          Uwe Schindler added a comment - This is neat, but I don't think we should advertise it? Definitely not. b.q. Ie, it's unsupported to mix different LockFactory impls. EG, in this case, the reverse is not true, right? Exactly.
          Hide
          Michael McCandless added a comment -

          Patch looks good! Don't forget to fix back compat tests.

          Show
          Michael McCandless added a comment - Patch looks good! Don't forget to fix back compat tests.
          Hide
          Uwe Schindler added a comment -

          Yes, I already prepared the BW test change (i simply disabled this test with the lock prefix).

          all FS-based lock factories use the same prefix encoding. They are now (mostly) compatible. E.g. a lock obtained with NativeFSLockFactory would also be seen as locked with SimpleFSLockFactory

          Ie, it's unsupported to mix different LockFactory impls.

          One important fact, because I enabled this: The deprecated methods in IndexWriter/IndexReader taking String/File args and FSDirectory.getDirectory() still use the SimpleFSLockFactory per default (or the system property). If some use mixes these deprecated calls with e.g. FSDir.open() he has still a chance to get locking work. But it is unsupported. I also added a extra note in the CHANGES.txt now, that warns because of this trap.

          Show
          Uwe Schindler added a comment - Yes, I already prepared the BW test change (i simply disabled this test with the lock prefix). all FS-based lock factories use the same prefix encoding. They are now (mostly) compatible. E.g. a lock obtained with NativeFSLockFactory would also be seen as locked with SimpleFSLockFactory Ie, it's unsupported to mix different LockFactory impls. One important fact, because I enabled this: The deprecated methods in IndexWriter/IndexReader taking String/File args and FSDirectory.getDirectory() still use the SimpleFSLockFactory per default (or the system property). If some use mixes these deprecated calls with e.g. FSDir.open() he has still a chance to get locking work. But it is unsupported. I also added a extra note in the CHANGES.txt now, that warns because of this trap.
          Hide
          Uwe Schindler added a comment -

          Formatting changes in CHANGES.txt and some minor tweaks.

          Also changed the isLocked() method for LUCENE-1885 to shortcut, if no lock file is present. In this case, without a lockfile it may also be not locked. This prevent NativeFSLock for creating the lock short time without really using it.

          It would be good, if somebody could also test this with strange file systems. I only tested Windows and Solaris

          Show
          Uwe Schindler added a comment - Formatting changes in CHANGES.txt and some minor tweaks. Also changed the isLocked() method for LUCENE-1885 to shortcut, if no lock file is present. In this case, without a lockfile it may also be not locked. This prevent NativeFSLock for creating the lock short time without really using it. It would be good, if somebody could also test this with strange file systems. I only tested Windows and Solaris
          Hide
          Uwe Schindler added a comment -

          I will commit soon!

          Show
          Uwe Schindler added a comment - I will commit soon!
          Hide
          Uwe Schindler added a comment -

          Committed revision: 811157

          Show
          Uwe Schindler added a comment - Committed revision: 811157
          Hide
          Thomas Mueller added a comment -

          FYI: other Java projects also implement exclusive locking, and automatic removal of such a file.

          Apache Jackrabbit uses FileChannel.lock() by default, but there are problems with some NFS implementations (some don't release the lock after restart, some don't support locks at all). Also, some operating systems / file systems allow multiple write locks within the same process (possibly in different class loaders). Jackrabbit works around that by (mis-)using a system property. See http://wiki.apache.org/jackrabbit/RepositoryLock

          For the H2 Database Engine I implemented a cooperative locking mechanism: http://www.h2database.com/html/advanced.html#file_locking_protocols - I have also ported that to Apache Jackrabbit ("Cooperative File Lock Mechanism"). It always works, but needs a background thread. H2 also supports a mechanism based on a server socket (open a server socket and write the IP address and port to the file) - but this is problematic if the network is misconfigured (localhost not bound to 127.0.0.1 and such) which does happen in practice.

          Show
          Thomas Mueller added a comment - FYI: other Java projects also implement exclusive locking, and automatic removal of such a file. Apache Jackrabbit uses FileChannel.lock() by default, but there are problems with some NFS implementations (some don't release the lock after restart, some don't support locks at all). Also, some operating systems / file systems allow multiple write locks within the same process (possibly in different class loaders). Jackrabbit works around that by (mis-)using a system property. See http://wiki.apache.org/jackrabbit/RepositoryLock For the H2 Database Engine I implemented a cooperative locking mechanism: http://www.h2database.com/html/advanced.html#file_locking_protocols - I have also ported that to Apache Jackrabbit ("Cooperative File Lock Mechanism"). It always works, but needs a background thread. H2 also supports a mechanism based on a server socket (open a server socket and write the IP address and port to the file) - but this is problematic if the network is misconfigured (localhost not bound to 127.0.0.1 and such) which does happen in practice.
          Hide
          Marvin Humphrey added a comment -

          > http://www.h2database.com/html/advanced.html#file_locking_protocols

          I'm a little concerned about the suitability of the polling approach for a
          low-level library like Lucene – shouldn't active code like that live in the
          application layer? Is it possible to exceed the polling interval for a low
          priority process on a system under heavy load? What happens when the app
          sleeps?

          For removing stale lock files, another technique is to incorporate the host
          name and the pid. So long as you can determine that the lock file belongs to
          your machine and that the PID is not active, you can safely zap it.

          Then tricky bit is how you get that information into the lock file. If you
          try to write that info to the lock file itself after an O_EXCL open, creating
          a fully valid lock file is no longer an atomic operation.

          The approach suggested by the creat(2) man page and endorsed in the Linux NFS
          FAQ involves hard links:

              The solution for performing atomic file locking using a lockfile
              is to create a unique file on the same file system (e.g.,
              incorporating hostname and pid), use link(2) to make a link to the
              lockfile. If link() returns 0, the lock is successful. Otherwise,
              use stat(2) on the unique file to check if its link count has
              increased to 2, in which case the lock is also successful. 
          

          This approach should also work on Windows for NTFS file systems since Windows
          2000 thanks to the CreateHardLink() function. (Samba file shares, you're out
          of luck.) However, I'm not sure about the state of support for hard links in
          Java.

          If you're interested in continuing this discussion, we should probably take it
          somewhere other than this closed issue.

          Show
          Marvin Humphrey added a comment - > http://www.h2database.com/html/advanced.html#file_locking_protocols I'm a little concerned about the suitability of the polling approach for a low-level library like Lucene – shouldn't active code like that live in the application layer? Is it possible to exceed the polling interval for a low priority process on a system under heavy load? What happens when the app sleeps? For removing stale lock files, another technique is to incorporate the host name and the pid. So long as you can determine that the lock file belongs to your machine and that the PID is not active, you can safely zap it. Then tricky bit is how you get that information into the lock file. If you try to write that info to the lock file itself after an O_EXCL open, creating a fully valid lock file is no longer an atomic operation. The approach suggested by the creat(2) man page and endorsed in the Linux NFS FAQ involves hard links: The solution for performing atomic file locking using a lockfile is to create a unique file on the same file system (e.g., incorporating hostname and pid), use link(2) to make a link to the lockfile. If link() returns 0, the lock is successful. Otherwise, use stat(2) on the unique file to check if its link count has increased to 2, in which case the lock is also successful. This approach should also work on Windows for NTFS file systems since Windows 2000 thanks to the CreateHardLink() function. (Samba file shares, you're out of luck.) However, I'm not sure about the state of support for hard links in Java. If you're interested in continuing this discussion, we should probably take it somewhere other than this closed issue.
          Hide
          Thomas Mueller added a comment -

          > take it somewhere other than this closed issue.

          Yes, where?

          > shouldn't active code like that live in the application layer?

          Why?

          > exceed the polling interval for a low priority process on a system under heavy load?

          The watchdog thread runs with high priority (see the H2 docs). It's still possible the thread isn't run at all, but highly unlikely. High priority threads are quite reliable. I wrote a MP3 player in Java (mp3transform) which I used a lot, I never heard any gaps. If the thread can be avoided, that would be great of course. I'm just trying to say that in theory, the thread is problematic, but in practice it isn't. While file locking is not a problem in theory, but in practice.

          > What happens when the app sleeps?

          Good question! Standby / hibernate are not supported. I didn't think about that. Is there a way to detect the wakeup?

          > host name and the pid

          Yes. It is not so easy to get the PID in Java, I found: http://stackoverflow.com/questions/35842/process-id-in-java "ManagementFactory.getRuntimeMXBean().getName()". What do you do if the lock was generated by another machine? I tried with using a server socket, so you need the IP address, but unfortunately, sometimes the network is not configured correctly (but maybe it's possible to detect that). Maybe the two machines can't access each other over TCP/IP.

          > hard links

          Yes, but it looks like this doesn't work always.

          Show
          Thomas Mueller added a comment - > take it somewhere other than this closed issue. Yes, where? > shouldn't active code like that live in the application layer? Why? > exceed the polling interval for a low priority process on a system under heavy load? The watchdog thread runs with high priority (see the H2 docs). It's still possible the thread isn't run at all, but highly unlikely. High priority threads are quite reliable. I wrote a MP3 player in Java (mp3transform) which I used a lot, I never heard any gaps. If the thread can be avoided, that would be great of course. I'm just trying to say that in theory, the thread is problematic, but in practice it isn't. While file locking is not a problem in theory, but in practice. > What happens when the app sleeps? Good question! Standby / hibernate are not supported. I didn't think about that. Is there a way to detect the wakeup? > host name and the pid Yes. It is not so easy to get the PID in Java, I found: http://stackoverflow.com/questions/35842/process-id-in-java "ManagementFactory.getRuntimeMXBean().getName()". What do you do if the lock was generated by another machine? I tried with using a server socket, so you need the IP address, but unfortunately, sometimes the network is not configured correctly (but maybe it's possible to detect that). Maybe the two machines can't access each other over TCP/IP. > hard links Yes, but it looks like this doesn't work always.
          Hide
          Thomas Mueller added a comment -

          > detect the wakeup / polling interval exceeded

          An obvious solution is to use System.currentTimeMillis() and compare with the expected delay. And then stop writing and throw a exception.

          Show
          Thomas Mueller added a comment - > detect the wakeup / polling interval exceeded An obvious solution is to use System.currentTimeMillis() and compare with the expected delay. And then stop writing and throw a exception.
          Hide
          Marvin Humphrey added a comment -

          >> take it somewhere other than this closed issue.
          >
          > Yes, where?

          The java-dev list: http://markmail.org/message/ivdgmxrivs3jzhfe

          Show
          Marvin Humphrey added a comment - >> take it somewhere other than this closed issue. > > Yes, where? The java-dev list: http://markmail.org/message/ivdgmxrivs3jzhfe
          Hide
          Greg Tarr added a comment -

          A previous comment read: "My only worry is current users counting on this default for NFS - but if we put it in the back compat break section (a break in regards to NFS anyway), that should be sufficient warning?"

          We use NFS with Lucene 2.9.4, and we have just noticed the change to NativeFSLockFactory in 2.9. Despite the above comment, it was missing from the 2.9.0 changes.txt. Unfortunately using it has led to index corruptions on several occasions.

          Please can everyone ensure that changes like this make the changes.txt file so that users can properly impact assess upgrades to new versions.

          Show
          Greg Tarr added a comment - A previous comment read: "My only worry is current users counting on this default for NFS - but if we put it in the back compat break section (a break in regards to NFS anyway), that should be sufficient warning?" We use NFS with Lucene 2.9.4, and we have just noticed the change to NativeFSLockFactory in 2.9. Despite the above comment, it was missing from the 2.9.0 changes.txt. Unfortunately using it has led to index corruptions on several occasions. Please can everyone ensure that changes like this make the changes.txt file so that users can properly impact assess upgrades to new versions.
          Hide
          Michael McCandless added a comment -

          Uggggggh, sorry about that Greg. Somehow this obviously very important note was lost in this issue.

          Can you describe how you use NFS and Lucene? Is there a single machine writing to the NFS dir, or more than one?

          Show
          Michael McCandless added a comment - Uggggggh, sorry about that Greg. Somehow this obviously very important note was lost in this issue. Can you describe how you use NFS and Lucene? Is there a single machine writing to the NFS dir, or more than one?
          Hide
          Greg Tarr added a comment -

          Instances of lucene run on machines with the indexes hosted remotely on a SAN with access through a fileserver. We've now changed our implementation to SimpleFSLockFactory in the hope this will lead to the write.lock files behaving properly.

          Show
          Greg Tarr added a comment - Instances of lucene run on machines with the indexes hosted remotely on a SAN with access through a fileserver. We've now changed our implementation to SimpleFSLockFactory in the hope this will lead to the write.lock files behaving properly.
          Hide
          Michael McCandless added a comment -

          But multiple machines are able to write to the same index on the SAN? (And must therefore rely on write.lock to protect the index from two writers at once).

          What corruption are you seeing...?

          Show
          Michael McCandless added a comment - But multiple machines are able to write to the same index on the SAN? (And must therefore rely on write.lock to protect the index from two writers at once). What corruption are you seeing...?
          Hide
          Greg Tarr added a comment -

          Yes, we have multiple machines being able to write to the same index on the SAN.

          Show
          Greg Tarr added a comment - Yes, we have multiple machines being able to write to the same index on the SAN.
          Hide
          Michael McCandless added a comment -

          OK. I would strongly recommend using the lock stress test (LockStressTest/LockVerifyServer) in Lucene to verify whichever locking you're trying is in fact working properly.

          Show
          Michael McCandless added a comment - OK. I would strongly recommend using the lock stress test (LockStressTest/LockVerifyServer) in Lucene to verify whichever locking you're trying is in fact working properly.
          Hide
          Nabble added a comment -

          Dear JIRA jira@apache.org,

          Please fix the embedding code of your Nabble application "[jira] Commented: (LUCENE-1877) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)" as soon as possible because Nabble will stop supporting that code in a few days. Here is the code you should use in your HTML page:

          <a id="nabblelink" href="http://lucene.472066.n3.nabble.com/jira-Created-LUCENE-1877-Improve-IndexWriter-javadoc-on-locking-tp574303p574326.html">[jira] Commented: (LUCENE-1877) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)</a>
          <script src="http://http://lucene.472066.n3.nabble.com/embed/f574326"></script>

          Here is the link to your application:
          http://lucene.472066.n3.nabble.com/jira-Created-LUCENE-1877-Improve-IndexWriter-javadoc-on-locking-tp574303p574326.html

          If you don't update your HTML page, the embedded application will stop working. Note that we won't send more emails about this issue and we apologize for the inconvenience.

          Sincerely,
          The Nabble team

          Show
          Nabble added a comment - Dear JIRA jira@apache.org, Please fix the embedding code of your Nabble application " [jira] Commented: ( LUCENE-1877 ) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)" as soon as possible because Nabble will stop supporting that code in a few days. Here is the code you should use in your HTML page: <a id="nabblelink" href="http://lucene.472066.n3.nabble.com/jira-Created-LUCENE-1877-Improve-IndexWriter-javadoc-on-locking-tp574303p574326.html"> [jira] Commented: ( LUCENE-1877 ) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)</a> <script src="http:// http://lucene.472066.n3.nabble.com/embed/f574326 "></script> Here is the link to your application: http://lucene.472066.n3.nabble.com/jira-Created-LUCENE-1877-Improve-IndexWriter-javadoc-on-locking-tp574303p574326.html If you don't update your HTML page, the embedded application will stop working. Note that we won't send more emails about this issue and we apologize for the inconvenience. Sincerely, The Nabble team

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Mark Miller
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development