Lucene - Core
  1. Lucene - Core
  2. LUCENE-6770

FSDirectory ctor should use getAbsolutePath instead of getRealPath for directory

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.2.1
    • Fix Version/s: 5.4, 6.0
    • Component/s: core/store
    • Labels:
      None
    • Environment:

      OS X, Linux

    • Lucene Fields:
      New, Patch Available

      Description

      After upgrade from 4.1 to 5.2.1 I found that one of our test failed. Appeared the guilty was FSDirectory that converts given Path to Path.getRealPath. As result the test will fail:

      Path p = Paths.get("/var/lucene_store");
      FSDirectory d = new FSDirectory(p);
      assertEquals(p.toString(), d.getDirectory().toString());

      It because /var/lucene_store is a symlink and
      Path directory =path.getRealPath();
      resolves it to /private/var/lucene_store

      I think this is bad design decision because "direcrory" isn't just internal state but is exposed in a public interface and "getDirectory()" is widely used to initialize other components.

      It should use paths.getAbsolutePath() instead.

      build and "ant test" were successful after fix.

      1. LUCENE-6770.patch
        2 kB
        Uwe Schindler
      2. LUCENE-6770.patch
        0.6 kB
        Vladimir Kuzmin

        Activity

        Hide
        Dawid Weiss added a comment -

        This is also an issue with Windows (if you use reparse points, so-called "junctions"). But if I recall, there was a valid reason for using this particular method.

        Show
        Dawid Weiss added a comment - This is also an issue with Windows (if you use reparse points, so-called "junctions"). But if I recall, there was a valid reason for using this particular method.
        Hide
        Uwe Schindler added a comment - - edited

        The reason why we have the canonic path is the following: The NativeFSLockFactory has the limitation that the underlying OS does not lock files for the same process. It only prevents other processes from using the locked file/directory. So NativeFSLockFactory internally uses a static set of "locked index directories" and during aquiring locks it first checks if the given directory is in the set. If this is true it refuses to aquire lock. Otherwise it fall backs to OS kernel in checking the lock.
        For this check with a simple set to work correctly, the path must be canonic. If this is not done, it may happen that a user opens in the same JVM an index with 2 different Path objects (which somehow point to same dir using symlink/hardlinks/junctions), causing index corrumption.
        As getting canonic path is quite expensive, we dont expand it on every try to lock (which may also break if people change links while having index open). So we do it on FSDirectory init.
        To work around the issue mentioned here, one possibility would be to save the original Path as given in Ctor and return that one getDirectory(). The canonic path would be an implementation detail.

        Show
        Uwe Schindler added a comment - - edited The reason why we have the canonic path is the following: The NativeFSLockFactory has the limitation that the underlying OS does not lock files for the same process. It only prevents other processes from using the locked file/directory. So NativeFSLockFactory internally uses a static set of "locked index directories" and during aquiring locks it first checks if the given directory is in the set. If this is true it refuses to aquire lock. Otherwise it fall backs to OS kernel in checking the lock. For this check with a simple set to work correctly, the path must be canonic. If this is not done, it may happen that a user opens in the same JVM an index with 2 different Path objects (which somehow point to same dir using symlink/hardlinks/junctions), causing index corrumption. As getting canonic path is quite expensive, we dont expand it on every try to lock (which may also break if people change links while having index open). So we do it on FSDirectory init. To work around the issue mentioned here, one possibility would be to save the original Path as given in Ctor and return that one getDirectory(). The canonic path would be an implementation detail.
        Hide
        Dawid Weiss added a comment -

        Thanks for clarifying this, Uwe. Makes sense.

        Show
        Dawid Weiss added a comment - Thanks for clarifying this, Uwe. Makes sense.
        Hide
        Vladimir Kuzmin added a comment -

        Thanks, Uwe, I see obtainLock converts directory toRealPath anyway and adds then to concurent set LOCK_HELD. Do you mean tha NIO guarantees that call toRealPath will always return the same value even if link has changed? Is it documented behavior? Yes, we could add another method, like getLockDirectory and return toRealPath saved at ctor. My concern is that this all internal LOCK_HELD optimization looks useless and may even has some unpredictable behavior. I would always delegate it to system call. obtainLock tries to create lock file on every call, calls toRealPath, throws exception if failed to add to LOCK_HELD - it doesnt look like a solid optimization. In short, I would fix FSDirectory and remove static concurent set from NativeLockFactory at all. If obtainLock really needs optimization, we would need something like tryToObtain that returns true/false, check if Path is 'real', check LOCK_HELD before trying to create file. I can miss some historical reason though.

        Show
        Vladimir Kuzmin added a comment - Thanks, Uwe, I see obtainLock converts directory toRealPath anyway and adds then to concurent set LOCK_HELD. Do you mean tha NIO guarantees that call toRealPath will always return the same value even if link has changed? Is it documented behavior? Yes, we could add another method, like getLockDirectory and return toRealPath saved at ctor. My concern is that this all internal LOCK_HELD optimization looks useless and may even has some unpredictable behavior. I would always delegate it to system call. obtainLock tries to create lock file on every call, calls toRealPath, throws exception if failed to add to LOCK_HELD - it doesnt look like a solid optimization. In short, I would fix FSDirectory and remove static concurent set from NativeLockFactory at all. If obtainLock really needs optimization, we would need something like tryToObtain that returns true/false, check if Path is 'real', check LOCK_HELD before trying to create file. I can miss some historical reason though.
        Hide
        Vladimir Kuzmin added a comment -

        Interesting, I see that in past versions Directory interface already had it LockID: http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/store/Directory.html#getLockID%28%29

        Show
        Vladimir Kuzmin added a comment - Interesting, I see that in past versions Directory interface already had it LockID: http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/store/Directory.html#getLockID%28%29
        Hide
        Uwe Schindler added a comment -

        getLockId was not related to that. It was completely unused. In earlier Lucene version this was used to create the lock file outside the index directory (e.g., in /tmp). For that it created a hash code that was included into file name. This is no longer possible, lock files are in index directory. This method was not used anymore.

        The real path is needed not for creating the lock file, it is needed for identifying the lock in the same JVM based on a unique key (so its similar but still different).

        Show
        Uwe Schindler added a comment - getLockId was not related to that. It was completely unused. In earlier Lucene version this was used to create the lock file outside the index directory (e.g., in /tmp). For that it created a hash code that was included into file name. This is no longer possible, lock files are in index directory. This method was not used anymore. The real path is needed not for creating the lock file, it is needed for identifying the lock in the same JVM based on a unique key (so its similar but still different).
        Hide
        Uwe Schindler added a comment - - edited

        My concern is that this all internal LOCK_HELD optimization looks useless and may even has some unpredictable behavior. I would always delegate it to system call.

        This is no optimization, it is a bug fix!!! If you close a file channel after non-successful locking in the same JVM it releases all locks on all other FileChannels,too (on some platforms, including Linux). This caused index corruption in some search application, because the lock was unfortunately released by this problem: other IndexWriter in same JVM tried to lock a second time and released the lock (and unfortunately all other locks) after work done. By that another process was able to write to index => BOOM

        See LUCENE-5738:

        Note this from the FileLock documentation (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/nio/channels/FileLock.java#28 )
        "On some systems, closing a channel releases all locks held by the Java virtual machine on the underlying file regardless of whether the locks were acquired via that channel or via another channel open on the same file. It is strongly recommended that, within a program, a unique channel be used to acquire all locks on any given file."

        This bug was fixed in Lucene in 4.9 by using the HashSet (we had that long ago, too, for similar reasons, so this was a regression). The change in 5.0 is just that the canonical path is aquired on FSDirectory ctor, because LockFactories are singletons and don't know about directories. In 4.x the canonical patch was aquired when creating a lock factory instance for a specific directory.

        So having the canonic/real path in a separate variable would mimic the same behaviour like 4.x, the "state" is just hold somewhere else.

        Show
        Uwe Schindler added a comment - - edited My concern is that this all internal LOCK_HELD optimization looks useless and may even has some unpredictable behavior. I would always delegate it to system call. This is no optimization, it is a bug fix!!! If you close a file channel after non-successful locking in the same JVM it releases all locks on all other FileChannels,too (on some platforms, including Linux). This caused index corruption in some search application, because the lock was unfortunately released by this problem: other IndexWriter in same JVM tried to lock a second time and released the lock (and unfortunately all other locks) after work done. By that another process was able to write to index => BOOM See LUCENE-5738 : Note this from the FileLock documentation ( http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/nio/channels/FileLock.java#28 ) "On some systems, closing a channel releases all locks held by the Java virtual machine on the underlying file regardless of whether the locks were acquired via that channel or via another channel open on the same file. It is strongly recommended that, within a program, a unique channel be used to acquire all locks on any given file." This bug was fixed in Lucene in 4.9 by using the HashSet (we had that long ago, too, for similar reasons, so this was a regression). The change in 5.0 is just that the canonical path is aquired on FSDirectory ctor, because LockFactories are singletons and don't know about directories. In 4.x the canonical patch was aquired when creating a lock factory instance for a specific directory. So having the canonic/real path in a separate variable would mimic the same behaviour like 4.x, the "state" is just hold somewhere else.
        Hide
        Robert Muir added a comment -

        I don't think we should make any changes here. Please, we finally just got the locking working! don't mess around with it unless you understand it!!!!!!

        Show
        Robert Muir added a comment - I don't think we should make any changes here. Please, we finally just got the locking working! don't mess around with it unless you understand it!!!!!!
        Hide
        Vladimir Kuzmin added a comment -

        Uwe, it makes it clear. Can you also clarify the last thing. I see the code at NativeFSLockFactory.java:

        87 protected Lock obtainFSLock(FSDirectory dir, String lockName) throws IOException {
        88 Path lockDir = dir.getDirectory();
        89
        90 // Ensure that lockDir exists and is a directory.
        91 // note: this will fail if lockDir is a symlink
        92 Files.createDirectories(lockDir);
        93
        94 Path lockFile = lockDir.resolve(lockName);
        95
        96 try

        { 97 Files.createFile(lockFile); 98 }

        catch (IOException ignore)

        { 99 // we must create the file to have a truly canonical path. 100 // if it's already created, we don't care. if it cant be created, it will fail below. 101 }

        102
        103 // fails if the lock file does not exist
        104 final Path realPath = lockFile.toRealPath();

        So, it calls toRealPath also. Does it mean that it relies on that after FSDirectory called
        directory = path.toRealPath();

        second call

        directory.toRealPath()

        doesn't resolve it but returns some internal state? I wasn't able to find this at documentation.

        Show
        Vladimir Kuzmin added a comment - Uwe, it makes it clear. Can you also clarify the last thing. I see the code at NativeFSLockFactory.java: 87 protected Lock obtainFSLock(FSDirectory dir, String lockName) throws IOException { 88 Path lockDir = dir.getDirectory(); 89 90 // Ensure that lockDir exists and is a directory. 91 // note: this will fail if lockDir is a symlink 92 Files.createDirectories(lockDir); 93 94 Path lockFile = lockDir.resolve(lockName); 95 96 try { 97 Files.createFile(lockFile); 98 } catch (IOException ignore) { 99 // we must create the file to have a truly canonical path. 100 // if it's already created, we don't care. if it cant be created, it will fail below. 101 } 102 103 // fails if the lock file does not exist 104 final Path realPath = lockFile.toRealPath(); So, it calls toRealPath also. Does it mean that it relies on that after FSDirectory called directory = path.toRealPath(); second call directory.toRealPath() doesn't resolve it but returns some internal state? I wasn't able to find this at documentation.
        Hide
        Uwe Schindler added a comment -

        Hi,

        thansk for pointing to this. The additional lockFile.toRealPath() was added later during some more fixes in NativeFSLockFactory. So indeed the realPath call in FSDirectory may be unneccessary. But we wont fix this in a minor version. This needs a lot of good tests using symlinks.

        For now I would just add more documentation to FSDirecory's ctor/open that explicitely states: "FSDirectory resolves the given Path in its constructor to a canonical path to ensure it can correctly lock the index directory and no other process can interfere with changing possible symlinks to the index directory inbetween. If you want to use symlinks and change them dynamically, close all IndexWriters and create a new FSDirecory instance."

        Show
        Uwe Schindler added a comment - Hi, thansk for pointing to this. The additional lockFile.toRealPath() was added later during some more fixes in NativeFSLockFactory. So indeed the realPath call in FSDirectory may be unneccessary. But we wont fix this in a minor version. This needs a lot of good tests using symlinks. For now I would just add more documentation to FSDirecory's ctor/open that explicitely states: "FSDirectory resolves the given Path in its constructor to a canonical path to ensure it can correctly lock the index directory and no other process can interfere with changing possible symlinks to the index directory inbetween. If you want to use symlinks and change them dynamically, close all IndexWriters and create a new FSDirecory instance."
        Hide
        Uwe Schindler added a comment - - edited

        So, it calls toRealPath also. Does it mean that it relies on that after FSDirectory called directory = path.toRealPath(); second call directory.toRealPath() doesn't resolve it but returns some internal state? I wasn't able to find this at documentation.

        This is just a test if the previous Files.createFile() worked (as this suppresses Exceptions). If the file was already existing and was something else, the toRealPath() call fails and we can bail out.

        Show
        Uwe Schindler added a comment - - edited So, it calls toRealPath also. Does it mean that it relies on that after FSDirectory called directory = path.toRealPath(); second call directory.toRealPath() doesn't resolve it but returns some internal state? I wasn't able to find this at documentation. This is just a test if the previous Files.createFile() worked (as this suppresses Exceptions). If the file was already existing and was something else, the toRealPath() call fails and we can bail out.
        Hide
        Uwe Schindler added a comment -

        Patch with additional documentation.

        Show
        Uwe Schindler added a comment - Patch with additional documentation.
        Hide
        Vladimir Kuzmin added a comment -

        +1

        btw, I just found that in fact, this was similar problem in 4.1 too! It also converted given path to canonical, just happened that I found it only when tried to upgrade to 5.2.1

        Show
        Vladimir Kuzmin added a comment - +1 btw, I just found that in fact, this was similar problem in 4.1 too! It also converted given path to canonical, just happened that I found it only when tried to upgrade to 5.2.1
        Hide
        ASF subversion and git services added a comment -

        Commit 1702619 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1702619 ]

        LUCENE-6770: Add javadocs that FSDirectory canonicalizes the path

        Show
        ASF subversion and git services added a comment - Commit 1702619 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1702619 ] LUCENE-6770 : Add javadocs that FSDirectory canonicalizes the path
        Hide
        ASF subversion and git services added a comment -

        Commit 1702621 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1702621 ]

        LUCENE-6770: Add javadocs that FSDirectory canonicalizes the path (changes entry)

        Show
        ASF subversion and git services added a comment - Commit 1702621 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1702621 ] LUCENE-6770 : Add javadocs that FSDirectory canonicalizes the path (changes entry)
        Hide
        ASF subversion and git services added a comment -

        Commit 1702622 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1702622 ]

        Merged revision(s) 1702619-1702621 from lucene/dev/trunk:
        LUCENE-6770: Add javadocs that FSDirectory canonicalizes the path

        Show
        ASF subversion and git services added a comment - Commit 1702622 from Uwe Schindler in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1702622 ] Merged revision(s) 1702619-1702621 from lucene/dev/trunk: LUCENE-6770 : Add javadocs that FSDirectory canonicalizes the path
        Hide
        Uwe Schindler added a comment -

        Fixed by adding Javadocs

        Show
        Uwe Schindler added a comment - Fixed by adding Javadocs

          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Vladimir Kuzmin
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development