Lucene - Core / LUCENE-7300

Add directory wrapper that optionally uses hardlinks in copyFrom

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.1
    • Fix Version/s: 6.1, master (7.0)
    • Component/s: None
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      Today we always do a byte-by-byte copy in Directory#copyFrom. While this is reliable and should remain the default, certain situations can be improved by using hardlinks, where possible, to get a constant-time copy on operating systems and file systems that support such an operation. Something like this could reside in misc if it's contained enough, since it requires LinkPermission to be granted and needs to detect whether both directories are subclasses of FSDirectory, etc.
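The idea can be sketched in plain java.nio terms. This is a minimal illustration and not the code from the attached patch: the class and method names (HardLinkCopySketch, linkOrCopy) are hypothetical, and it operates on raw Path objects rather than on Lucene Directory instances. It tries Files.createLink first and falls back to a byte-by-byte Files.copy when hardlinks are unsupported or not permitted.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not the code from the attached patch. */
public final class HardLinkCopySketch {
  private HardLinkCopySketch() {}

  /**
   * Makes dest refer to the same content as source.
   * Returns true if a hardlink was created, false if the bytes were copied.
   */
  public static boolean linkOrCopy(Path source, Path dest) throws IOException {
    try {
      // Constant time on file systems that support hardlinks.
      Files.createLink(dest, source);
      return true;
    } catch (FileAlreadyExistsException | NoSuchFileException e) {
      // A copy would fail for the same reason, so do not hide the error.
      throw e;
    } catch (UnsupportedOperationException | SecurityException | IOException e) {
      // Hardlinks unsupported, not permitted, or otherwise failed: do a real copy.
      Files.copy(source, dest);
      return false;
    }
  }
}
```

The boolean return value is only there so a caller can tell which path was taken; a wrapper that overrides a copy method would normally not need it.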

      1. LUCENE-7300.patch
        10 kB
        Simon Willnauer
      2. LUCENE-7300.patch
        10 kB
        Simon Willnauer

        Activity

        simonw Simon Willnauer added a comment -

        Here is a patch that adds such a directory, with a test.

        dweiss Dawid Weiss added a comment -

        This works as long as files are immutable (a hardlink to the same file would still reflect changes to that file after it's created, whereas a copy is, well, a copy). Perhaps it's worth clarifying this somehow in the docs?
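The difference can be demonstrated with a small java.nio sketch. The class name LinkVersusCopyDemo and the file names are hypothetical, and the example assumes the directory it is given lives on a file system that supports hardlinks. After the original is appended to, the hardlink sees the new bytes while the copy does not.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: a hardlink tracks later changes, a copy does not. */
public final class LinkVersusCopyDemo {
  private LinkVersusCopyDemo() {}

  /** Returns {contentSeenViaLink, contentSeenViaCopy} after the original is appended to. */
  public static String[] mutateAndRead(Path dir) throws IOException {
    Path original = dir.resolve("original.txt");
    Path link = dir.resolve("link.txt");
    Path copy = dir.resolve("copy.txt");

    Files.write(original, "v1".getBytes(StandardCharsets.UTF_8));
    Files.createLink(link, original); // second name for the same underlying file
    Files.copy(original, copy);       // independent bytes

    // Mutate the original in place after both the link and the copy exist.
    Files.write(original, "+v2".getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);

    return new String[] {
      new String(Files.readAllBytes(link), StandardCharsets.UTF_8),
      new String(Files.readAllBytes(copy), StandardCharsets.UTF_8)
    };
  }
}
```

With write-once files the two results would never diverge, which is why the immutability condition matters.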

        dweiss Dawid Weiss added a comment -
        +      assertEquals("hey man, nice shot!", indexInput.readString());
        

        https://www.youtube.com/watch?v=GEMVGHoenXM

        simonw Simon Willnauer added a comment -

        https://www.youtube.com/watch?v=I3yvFmi_q1M

        simonw Simon Willnauer added a comment -

        I added some lines to the javadocs, Dawid Weiss.

        dweiss Dawid Weiss added a comment -

        Looks (and sounds) great to me.

        rcmuir Robert Muir added a comment -

        +1

        jira-bot ASF subversion and git services added a comment -

        Commit 268da5be45e5eed570575eea6a9e85a4cdb658e7 in lucene-solr's branch refs/heads/master from Simon Willnauer
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=268da5b ]

        LUCENE-7300: Add HardLinkCopyDirectoryWrapper to speed up file copying if hardlinks are applicable

        jira-bot ASF subversion and git services added a comment -

        Commit a6839beb87a73bff6139df44a7b9168a498dd426 in lucene-solr's branch refs/heads/branch_6x from Simon Willnauer
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a6839be ]

        LUCENE-7300: Add HardLinkCopyDirectoryWrapper to speed up file copying if hardlinks are applicable

        thetaphi Uwe Schindler added a comment -

        I have seen the following test failure:

        Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/775/
        Java: 64bit/jdk-9-ea+120 -XX:+UseCompressedOops -XX:+UseSerialGC

        1 tests failed.
        FAILED: org.apache.lucene.store.TestHardLinkCopyDirectoryWrapper.testCopyHardLinks

        Error Message:
        /home/jenkins/workspace/Lucene-Solr-6.x-Linux/lucene/build/misc/test/J2/temp/lucene.store.TestHardLinkCopyDirectoryWrapper_97CB7FCC9FBCBDC6-001/tempDir-001/test -> /home/jenkins/workspace/Lucene-Solr-6.x-Linux/lucene/build/misc/test/J2/temp/lucene.store.TestHardLinkCopyDirectoryWrapper_97CB7FCC9FBCBDC6-001/tempDir-001/dir_1/foo.bar
        
        Stack Trace:
        java.nio.file.NoSuchFileException: /home/jenkins/workspace/Lucene-Solr-6.x-Linux/lucene/build/misc/test/J2/temp/lucene.store.TestHardLinkCopyDirectoryWrapper_97CB7FCC9FBCBDC6-001/tempDir-001/test -> /home/jenkins/workspace/Lucene-Solr-6.x-Linux/lucene/build/misc/test/J2/temp/lucene.store.TestHardLinkCopyDirectoryWrapper_97CB7FCC9FBCBDC6-001/tempDir-001/dir_1/foo.bar
        	at __randomizedtesting.SeedInfo.seed([97CB7FCC9FBCBDC6:769E35B963DCF416]:0)
        	at sun.nio.fs.UnixException.translateToIOException(java.base@9-ea/UnixException.java:92)
        	at sun.nio.fs.UnixException.rethrowAsIOException(java.base@9-ea/UnixException.java:111)
        	at sun.nio.fs.UnixFileSystemProvider.createLink(java.base@9-ea/UnixFileSystemProvider.java:477)
        	at org.apache.lucene.mockfile.FilterFileSystemProvider.createLink(FilterFileSystemProvider.java:233)
        	at org.apache.lucene.mockfile.FilterFileSystemProvider.createLink(FilterFileSystemProvider.java:233)
        	at org.apache.lucene.mockfile.FilterFileSystemProvider.createLink(FilterFileSystemProvider.java:233)
        	at org.apache.lucene.mockfile.FilterFileSystemProvider.createLink(FilterFileSystemProvider.java:233)
        	at org.apache.lucene.mockfile.FilterFileSystemProvider.createLink(FilterFileSystemProvider.java:233)
        	at java.nio.file.Files.createLink(java.base@9-ea/Files.java:1089)
        	at org.apache.lucene.store.TestHardLinkCopyDirectoryWrapper.testCopyHardLinks(TestHardLinkCopyDirectoryWrapper.java:55)
        	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native Method)
        	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
        	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
        	at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:531)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
        	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
        	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
        	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
        	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
        	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
        	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
        	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
        	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
        	at java.lang.Thread.run(java.base@9-ea/Thread.java:843)
        
        mikemccand Michael McCandless added a comment -

        It reproduces:

        ant test -Dtestcase=TestHardLinkCopyDirectoryWrapper -Dtests.method=testCopyHardLinks -Dtests.seed=91C212F610DC4EB5 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/x1/jenkins/lucene-data/enwiki.random.lines.txt -Dtests.locale=it-CH -Dtests.timezone=Canada/Newfoundland -Dtests.asserts=true -Dtests.file.encoding=UTF-8 -Dtests.verbose=true
        
        mikemccand Michael McCandless added a comment -

        OK, I see the problem ... it looks like a test bug: with this seed, newFSDirectory returned an NRTCachingDirectory, but the test assumes that when it creates the output foo.bar it will be written through to the file system ... I'll fix it by adding a sync call to force that.
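The failure mode can be illustrated without Lucene. The sketch below is a hypothetical stand-in for a caching directory (WriteBehindStoreSketch and its methods are invented for this example and are not Lucene's NRTCachingDirectory API): a write stays in memory until sync is called, so a hardlink to the file cannot be created before then.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a caching directory; not Lucene's NRTCachingDirectory. */
public final class WriteBehindStoreSketch {
  private final Path dir;
  private final Map<String, byte[]> pending = new HashMap<>();

  public WriteBehindStoreSketch(Path dir) {
    this.dir = dir;
  }

  /** Buffers the bytes in memory only; nothing reaches the file system yet. */
  public void write(String name, byte[] bytes) {
    pending.put(name, bytes.clone());
  }

  /** Flushes a buffered file to disk, after which it can be hardlinked. */
  public void sync(String name) throws IOException {
    byte[] bytes = pending.remove(name);
    if (bytes != null) {
      Files.write(dir.resolve(name), bytes);
    }
  }

  /** Returns true if the link was created, false if the source is not on disk yet. */
  public boolean tryLink(String name, String linkName) throws IOException {
    Path existing = dir.resolve(name);
    if (!Files.exists(existing)) {
      // Calling Files.createLink here would fail because the link target is missing.
      return false;
    }
    Files.createLink(dir.resolve(linkName), existing);
    return true;
  }
}
```

Calling write, then tryLink, returns false; calling sync first makes the same tryLink succeed, which mirrors the effect of the sync call described above.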

        jira-bot ASF subversion and git services added a comment -

        Commit 979af27209a10b41857cbf6c7439472c3eca5983 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=979af27 ]

        LUCENE-7300: fix test bug to ensure the newly created file is in fact written through to the underlying filesystem even if NRTCachingDirectory is used

        jira-bot ASF subversion and git services added a comment -

        Commit 032247ff6e4d576f179a3db2050af6bedf9c716c in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=032247f ]

        LUCENE-7300: fix test bug to ensure the newly created file is in fact written through to the underlying filesystem even if NRTCachingDirectory is used


          People

          • Assignee:
            simonw Simon Willnauer
            Reporter:
            simonw Simon Willnauer