Hadoop Map/Reduce
MAPREDUCE-645

When distcp is used to overwrite a file, it should return immediately with an error message

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When distcp is triggered to copy a directory to an already existing file, it only shows a "copy failed" message after 4 attempts, with no indication of the actual cause. This is extremely time-consuming on a large cluster, especially when the directory being copied contains several sub-directories.
      Instead, distcp should return immediately and display a useful error message when a user attempts such an operation. (This is an unlikely situation, but still a valid test case.)

      1. d_645_v1.patch
        1.0 kB
        Ravi Gummadi
      2. d_645.patch
        0.9 kB
        Ravi Gummadi
      3. distcp.txt
        11 kB
        Ramya Sunil

        Activity

        Transition                    Time In Source Status  Execution Times  Last Executer  Last Execution Date
        Patch Available -> Open       13d 3h 4m              1                Ravi Gummadi   08/Sep/09 16:01
        Open -> Patch Available       201d 23h 51m           2                Ravi Gummadi   08/Sep/09 16:04
        Patch Available -> Resolved   9d 15h 50m             1                Chris Douglas  18/Sep/09 07:54
        Resolved -> Closed            340d 14h 19m           1                Tom White      24/Aug/10 22:14
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        gary murry added a comment -

        Can we get a note about why no new unit tests were added? Thanks

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #49 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/49/).
        Prevent distcp from running a job when the destination is a
        file, but the source is not. Contributed by Ravi Gummadi

        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.21.0 [ 12314045 ]
        Resolution Fixed [ 1 ]
        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Ravi!

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12418919/d_645_v1.patch
        against trunk revision 812546.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/48/console

        This message is automatically generated.

        Ravi Gummadi made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Ravi Gummadi made changes -
        Attachment d_645_v1.patch [ 12418919 ]
        Ravi Gummadi added a comment -

        Attaching patch that applies after MAPREDUCE-649 & MAPREDUCE-654 are committed.

        Please review and provide your comments.

        Ravi Gummadi made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12412151/d_645.patch
        against trunk revision 808351.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/528/console

        This message is automatically generated.

        Ravi Gummadi made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Ravi Gummadi made changes -
        Attachment d_645.patch [ 12412151 ]
        Ravi Gummadi added a comment -

        Attaching a patch that makes distcp check whether the destination is a file when it is expected to be a directory, and emit a meaningful error message from the setup phase itself.

        Please review and provide your comments.
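
        For illustration, the kind of setup-phase check described above can be sketched as follows. This is not the code from d_645.patch: it uses java.nio.file on a local filesystem instead of Hadoop's FileSystem API so that it runs standalone, and the class name DestinationCheck and method checkDestination are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: reject a directory-to-file copy before any copy work starts.
class DestinationCheck {

    // Throws if dst is an existing regular file while some source is a directory.
    static void checkDestination(List<Path> sources, Path dst) throws IOException {
        if (!Files.isRegularFile(dst)) {
            return; // destination is missing or is a directory: nothing to reject here
        }
        for (Path src : sources) {
            if (Files.isDirectory(src)) {
                throw new IOException("Destination " + dst
                        + " is an existing file, but source " + src + " is a directory");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path srcDir = Files.createTempDirectory("src");
        Path dstFile = Files.createTempFile("dst", ".txt");
        try {
            checkDestination(Arrays.asList(srcDir), dstFile);
            System.out.println("check passed; the copy could proceed");
        } catch (IOException e) {
            // The failure is reported up front instead of after repeated task attempts.
            System.out.println("rejected during setup: " + e.getMessage());
        }
    }
}
```

        Running the check before submitting any work means the user sees the cause at once, rather than a generic failure after the map task has been retried.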

        Ravi Gummadi made changes -
        Assignee Ravi Gummadi [ ravidotg ]
        Ravi Gummadi added a comment -

        Fine. It looks like waiting for the first file copy to finish is not a good idea: because copying is done to a temporary dir, we don't see the failure until a file is renamed/moved from the temporary dir to the actual destination. This could take a lot of time if all the files in the source dir are big. Also, the MR job wouldn't fail until 4 attempts of the map task have failed.

        Ravi Gummadi added a comment -

        A more meaningful error message is available in the syslog of the map task.

        Owen O'Malley made changes -
        Project Hadoop Common [ 12310240 ] Hadoop Map/Reduce [ 12310941 ]
        Key HADOOP-5173 MAPREDUCE-645
        Affects Version/s 0.18.3 [ 12313494 ]
        Issue Type Improvement [ 4 ] Bug [ 1 ]
        Component/s distcp [ 12312902 ]
        Component/s tools/distcp [ 12312387 ]
        Fix Version/s 0.18.4 [ 12313628 ]
        Tsz Wo Nicholas Sze added a comment -

        After a job is submitted, the map/reduce framework controls how the job runs and how failures are handled.

        As shown in the output (distcp.txt), you may "consider running with -i".

        Ramya Sunil made changes -
        Field Original Value New Value
        Attachment distcp.txt [ 12399559 ]
        Ramya Sunil added a comment - - edited

        Attaching the output message when the above test is run on a small cluster.

        Ramya Sunil created issue -

          People

          • Assignee: Ravi Gummadi
          • Reporter: Ramya Sunil
          • Votes: 0
          • Watchers: 3
