Issue Details (XML | Word | Printable)

Key: HADOOP-730
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Chris Douglas
Reporter: Owen O'Malley
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Local file system uses copy to implement rename

Created: 16/Nov/06 08:07 AM   Updated: 21/May/08 08:05 PM
Return to search
Component/s: fs
Affects Version/s: 0.8.0
Fix Version/s: 0.17.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works 730-0.patch 2008-02-09 11:52 PM Chris Douglas 2 kB
Text File Licensed for inclusion in ASF works 730-1.patch 2008-02-10 09:09 AM Chris Douglas 2 kB
Text File Licensed for inclusion in ASF works 730-2.patch 2008-02-13 11:04 PM Chris Douglas 1 kB

Resolution Date: 06/Mar/08 09:13 PM


 Description  « Hide
There is a variable LocalFileSystem.useCopyForRename that is set to true. When true, the local file system will implement rename as a copy followed by a delete. This is likely a performance problem. Is there a reason that useCopyForRename is set?

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Doug Cutting added a comment - 16/Nov/06 06:16 PM
I think the issue was that rename isn't guaranteed to work across volumes. Linux 'mv' typically works across volumes these days, by resorting to copying, but I'm not sure what Java's file rename does.

Rod Taylor added a comment - 03/Oct/07 04:11 PM - edited
Couldn't it try the renameTo() operation and if it fails then fallback to the slower copy&delete method?

if (pathToFile(src).renameTo(pathToFile(dst))) { return true; }
else { return FileUtil.copy(this, src, this, dst, true, getConf()); }

would replace

if (useCopyForRename) { return FileUtil.copy(this, src, this, dst, true, getConf()); } } else return pathToFile(src).renameTo(pathToFile(dst));


Doug Cutting added a comment - 03/Oct/07 05:37 PM
> Couldn't it try the renameTo() operation and if it fails then fallback to the slower copy&delete method?

Sounds good to me. That might provide a nice speedup too. Care to provide a patch?


Chris Douglas added a comment - 09/Feb/08 03:59 AM
This implements Rod's idea and updates LocalFileSystem to reflect its semantics.

Hadoop QA added a comment - 10/Feb/08 05:26 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12375163/730-0.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac -1. The applied patch generated 614 javac compiler warnings (more than the trunk's current 612 warnings).

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests -1. The patch failed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1767/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1767/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1767/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1767/console

This message is automatically generated.


Chris Douglas added a comment - 10/Feb/08 09:09 AM
Replaced calls to FileSystem::isDirectory() with FileSystem::getFileStatus().isDir() to resolve the deprecation warnings. This doesn't need new unit tests because the existing ones cover the semantics for rename. The core test that failed on Hudson passes on my machine.

Hadoop QA added a comment - 10/Feb/08 10:22 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12375171/730-1.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1768/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1768/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1768/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1768/console

This message is automatically generated.


Owen O'Malley added a comment - 11/Feb/08 07:15 PM
+1 pending a unit test that tests the behavior that was previously broken.

I also think that using if/then/else would be more readable than ?:.


Chris Douglas added a comment - 13/Feb/08 11:04 PM
The rename code was removed from the patch to LocalFileSystem, since ChecksumFileSystem's is sufficient (it was suspected that copying on rename was causing the crc to be regenerated as part of the move, but that was mistaken). Since this doesn't patch a bug, I'm moving it to 0.17. Without the bug fix, it's trivial enough that I don't think it needs a test case. I also changed the ternary operator to an if/else stmt.

Hadoop QA added a comment - 14/Feb/08 02:34 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12375537/730-2.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1794/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1794/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1794/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1794/console

This message is automatically generated.


Tsz Wo (Nicholas), SZE added a comment - 06/Mar/08 07:48 PM
+1 codes look good

Chris Douglas added a comment - 06/Mar/08 09:13 PM
I just committed this.

Hudson added a comment - 07/Mar/08 12:11 PM