Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-645

When disctp is used to overwrite a file, it should return immediately with an error message

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When disctp is triggered to copy a directory to an already existing file, it just shows a "copy failed" error message after 4 attempts without showing any useful error message. This is extremely time consuming on a large cluster and especially when the directory being copied contains several sub-directories.
      Instead, it would be an improvement if distcp could return immediately displaying a useful error message when an user attempts such an operation. (This is an unlikely situation but still a valid test case)

      1. d_645_v1.patch
        1.0 kB
        Ravi Gummadi
      2. d_645.patch
        0.9 kB
        Ravi Gummadi
      3. distcp.txt
        11 kB
        Ramya Sunil

        Activity

        Hide
        Ramya Sunil added a comment - - edited

        Attaching the output message when the above test is run on a small cluster.

        Show
        Ramya Sunil added a comment - - edited Attaching the output message when the above test is run on a small cluster.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        After a job is submitted, the map/reduce frame control how the job runs and how to handle the failures.

        As shown in the output (distcp.txt), you may "consider running with -i".

        Show
        Tsz Wo Nicholas Sze added a comment - After a job is submitted, the map/reduce frame control how the job runs and how to handle the failures. As shown in the output (distcp.txt), you may "consider running with -i".
        Hide
        Ravi Gummadi added a comment -

        More meaningful error message is available in the syslog of map task.

        Show
        Ravi Gummadi added a comment - More meaningful error message is available in the syslog of map task.
        Hide
        Ravi Gummadi added a comment -

        Fine. Looks like waiting for copying of first file to finish is not a good idea(as the copying is done to temporary dir, we don't see failure until a file is renamed/moved from temporary dir to actual destination). This could take a lot of time if all the files in the source dir are big. Also MR job wouldn't fail until 4 tries of the map task are failed.

        Show
        Ravi Gummadi added a comment - Fine. Looks like waiting for copying of first file to finish is not a good idea(as the copying is done to temporary dir, we don't see failure until a file is renamed/moved from temporary dir to actual destination). This could take a lot of time if all the files in the source dir are big. Also MR job wouldn't fail until 4 tries of the map task are failed.
        Hide
        Ravi Gummadi added a comment -

        Attaching patch that makes distcp to check if the destination is a file and is expected to be a dir and emit a meaningful error message from setup phase itself.

        Please review and provide your comments.

        Show
        Ravi Gummadi added a comment - Attaching patch that makes distcp to check if the destination is a file and is expected to be a dir and emit a meaningful error message from setup phase itself. Please review and provide your comments.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12412151/d_645.patch
        against trunk revision 808351.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/528/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12412151/d_645.patch against trunk revision 808351. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/528/console This message is automatically generated.
        Hide
        Ravi Gummadi added a comment -

        Attaching patch that applies after MAPREDUCE-649 & MAPREDUCE-654 are committed.

        Please review and provide your comments.

        Show
        Ravi Gummadi added a comment - Attaching patch that applies after MAPREDUCE-649 & MAPREDUCE-654 are committed. Please review and provide your comments.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12418919/d_645_v1.patch
        against trunk revision 812546.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418919/d_645_v1.patch against trunk revision 812546. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/console This message is automatically generated.
        Hide
        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Ravi!

        Show
        Chris Douglas added a comment - +1 I committed this. Thanks, Ravi!
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #49 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/49/)
        . Prevent distcp from running a job when the destination is a
        file, but the source is not. Contributed by Ravi Gummadi

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #49 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/49/ ) . Prevent distcp from running a job when the destination is a file, but the source is not. Contributed by Ravi Gummadi
        Hide
        gary murry added a comment -

        Can we get a note about why no new unit tests were added? Thanks

        Show
        gary murry added a comment - Can we get a note about why no new unit tests were added? Thanks

          People

          • Assignee:
            Ravi Gummadi
            Reporter:
            Ramya Sunil
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development