
Key: HADOOP-2984
Type: Task
Status: Closed
Resolution: Fixed
Priority: Blocker
Assignee: Chris Douglas
Reporter: Owen O'Malley
Votes: 0
Watchers: 0

Hadoop Common

Distcp should have forrest documentation

Created: 10/Mar/08 02:39 PM   Updated: 22/Aug/08 07:50 PM
Component/s: util
Affects Version/s: None
Fix Version/s: 0.18.0

Time Tracking:
Not Specified

File Attachments (text files, licensed for inclusion in ASF works):
  File           Uploaded              Author          Size
  2984-0.patch   2008-06-10 03:48 AM   Chris Douglas   97 kB
  2984-1.patch   2008-06-10 06:39 PM   Chris Douglas   63 kB
  2984-2.patch   2008-06-13 08:40 PM   Chris Douglas   64 kB
  2984-3.patch   2008-06-13 11:51 PM   Chris Douglas   70 kB

Hadoop Flags: Reviewed
Resolution Date: 14/Jun/08 12:39 AM


Description
We really should have a page on how to use distcp.

Chris Douglas added a comment - 10/Jun/08 03:45 AM
First draft

Hadoop QA added a comment - 10/Jun/08 05:18 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383727/2984-0.patch
against trunk revision 665937.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

-1 release audit. The applied patch generated 203 release audit warnings (more than the trunk's current 201 warnings).

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2627/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2627/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2627/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2627/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2627/console

This message is automatically generated.


Chris Douglas added a comment - 10/Jun/08 06:39 PM - edited
I changed the name of the file, and might have missed the license somehow... trying again.

Hadoop QA added a comment - 10/Jun/08 08:14 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383764/2984-1.patch
against trunk revision 666056.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

-1 release audit. The applied patch generated 202 release audit warnings (more than the trunk's current 201 warnings).

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2632/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2632/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2632/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2632/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2632/console

This message is automatically generated.


Tsz Wo (Nicholas), SZE added a comment - 11/Jun/08 10:30 PM - edited
  • Provide the full class name of DistCp at the beginning of the document.
  • Add a line to explain the name distcp.
  • Some examples have long single-line commands. These get wrapped in the document, especially in the PDF. It is better to use \ to break a long command into several lines, e.g.
    Use
    hadoop distcp               \
        hdfs://nn1:8020/foo/a   \
        hdfs://nn1:8020/foo/b   \
        hdfs://nn2:8020/bar/foo
    

    instead of

    hadoop distcp hdfs://nn1:8020/foo/a hdfs://nn1:8020/foo/b hdfs://nn2:8020/bar/foo
    
  • I think it would be clearer to add a command prompt for shell commands
    e.g.
    bash$ hadoop distcp ...
    

Tsz Wo (Nicholas), SZE added a comment - 11/Jun/08 11:13 PM
BTW, there are some unrelated changes to docs/hadoop-default.html in the patch.

Chris Douglas added a comment - 13/Jun/08 08:40 PM

Provide the full class name of DistCp at the beginning of the document.

Since this is a guide to users of distcp and not developers, I left this out.

Add a line to explain the name distcp.

Some examples have long single-line commands. These get wrapped in the document, especially in the PDF. It is better to use \ to break a long command into several lines.

I think it would be clearer to add a command prompt for shell commands

Good ideas; done.

Thanks for the review.


Koji Noguchi added a comment - 13/Jun/08 11:01 PM
+1.

It's also worth noting that if another client is still writing to a source file, the copy will likely fail.

Maybe also mention,

  • If any source files are deleted after distcp has started, mappers would fail (with file not found).
  • If speculative execution is turned on as 'final', behavior of distcp is undefined.

It's worth giving some examples of -update and -overwrite.

I always had trouble with these options.
Could you show what the target directory structures look like after the distcp?
(with and without the -update/-overwrite options)
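
As a rough illustration of the two layouts asked about above, here is a local-filesystem analogy. It is not distcp itself: plain `cp -R` stands in for the copy, every path is hypothetical, and the mapping of each layout to "with" or "without" -update/-overwrite is an assumption about distcp's behavior that should be checked against the committed guide.

```shell
# Local-filesystem analogy only: `cp -R` stands in for distcp, and all
# paths below are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src/a" "$work/src/b"
touch "$work/src/a/aa" "$work/src/a/ab" "$work/src/b/ba" "$work/src/b/bb"

# Assumed layout without -update/-overwrite: the source directories
# themselves appear under the target.
mkdir -p "$work/plain"
cp -R "$work/src/a" "$work/src/b" "$work/plain/"

# Assumed layout with -update or -overwrite: the contents of each source
# directory are placed directly under the target.
mkdir -p "$work/update"
cp -R "$work/src/a/." "$work/src/b/." "$work/update/"

# List both target layouts for comparison, then clean up.
plain_layout=$(cd "$work/plain" && find . -type f | sort)
update_layout=$(cd "$work/update" && find . -type f | sort)
printf 'plain:\n%s\nupdate:\n%s\n' "$plain_layout" "$update_layout"
rm -rf "$work"
```

The file names are deliberately distinct across the two source directories; if both sources contained an entry with the same name, the second layout would map them to the same target path.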


Chris Douglas added a comment - 13/Jun/08 11:51 PM
Incorporated Koji's feedback

Chris Douglas added a comment - 14/Jun/08 12:39 AM
I just committed this.

Hudson added a comment - 16/Jun/08 10:38 PM