Hadoop Map/Reduce / MAPREDUCE-971

distcp does not always remove distcp.tmp.dir

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Sometimes distcp leaves behind its tmpdir when the target filesystem is s3n.

        Activity

        Aaron Kimball created issue -
        Aaron Kimball added a comment -

        This patch fixes the problem by explicitly creating the temp directory. File open operations in, e.g., hdfs, will auto-create the tmpdir. But in s3n, which expects an object with the name somename$folder$, this won't happen. As a result, the fullyDelete() call fails (silently) because the folder doesn't exist, even though there are objects with the tmpdir prefix in their object names.

        I tested this patch manually by verifying temp dir creation during a distcp to s3n, and verifying that the temp dir object was removed at the end of the transfer.
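
        The failure mode described above can be illustrated with a minimal, self-contained sketch. It is a toy in-memory model, not Hadoop's s3n implementation: the class ToyObjectStore, its methods, and the key names are hypothetical, and the marker suffix simply follows the somename$folder$ convention mentioned in the comment. It assumes, as the comment describes, that writing an object does not create a folder marker and that a recursive delete gives up quietly when the marker is missing.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy flat object store with folder-marker objects (hypothetical; not Hadoop code). */
final class ToyObjectStore {
    // Assumed marker suffix, taken from the "somename$folder$" naming in the comment.
    static final String FOLDER_SUFFIX = "$folder$";

    private final Set<String> keys = new TreeSet<>();

    /** Writes an object. Unlike hdfs, no parent directory marker is created. */
    void create(String key) {
        keys.add(key);
    }

    /** Explicitly creates a directory by writing its marker object. */
    void mkdirs(String dir) {
        keys.add(dir + FOLDER_SUFFIX);
    }

    /** A directory "exists" only if its marker object is present. */
    boolean isDirectory(String dir) {
        return keys.contains(dir + FOLDER_SUFFIX);
    }

    /**
     * Recursive delete. Returns false and removes nothing when the directory
     * marker is absent, which models the silent failure described above.
     */
    boolean fullyDelete(String dir) {
        if (!isDirectory(dir)) {
            return false;
        }
        keys.removeIf(k -> k.equals(dir + FOLDER_SUFFIX) || k.startsWith(dir + "/"));
        return true;
    }

    int objectCount() {
        return keys.size();
    }

    public static void main(String[] args) {
        // Before the fix: objects are written under the tmpdir prefix only.
        ToyObjectStore before = new ToyObjectStore();
        before.create("distcp_tmp/part-0");
        System.out.println("without mkdirs: deleted=" + before.fullyDelete("distcp_tmp")
                + ", remaining=" + before.objectCount());
        // prints "without mkdirs: deleted=false, remaining=1"

        // After the fix: the tmpdir is created explicitly first.
        ToyObjectStore after = new ToyObjectStore();
        after.mkdirs("distcp_tmp");
        after.create("distcp_tmp/part-0");
        System.out.println("with mkdirs: deleted=" + after.fullyDelete("distcp_tmp")
                + ", remaining=" + after.objectCount());
        // prints "with mkdirs: deleted=true, remaining=0"
    }
}
```

        In this model the only difference between the two runs is the explicit mkdirs() call, which is the change the patch is described as making. The sketch does not exercise a real S3 bucket, so it illustrates the reasoning rather than verifying the fix.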

        Aaron Kimball made changes -
        Field Original Value New Value
        Attachment MAPREDUCE-971.patch [ 12419248 ]
        Aaron Kimball made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12419248/MAPREDUCE-971.patch
        against trunk revision 813585.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/26/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/26/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/26/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/26/console

        This message is automatically generated.

        Todd Lipcon added a comment -

        +1, patch lgtm

        Tom White added a comment -

        I've just committed this. Thanks Aaron!

        Tom White made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.21.0 [ 12314045 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #46 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/46/)
        distcp does not always remove distcp.tmp.dir. Contributed by Aaron Kimball.

        gary murry added a comment -

        It is good that this was tested manually, and it is appreciated that the manual test was outlined here. But why was no unit test added so that the fix can be verified automatically on future builds?

        Aaron Kimball added a comment -

        An automated unit test for an S3-based system would require hardcoding S3 access credentials and connecting to an S3 account (which is a for-pay resource).

        gary murry added a comment -

        Cool, thanks for the additional info.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #106 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/106/)
        Document use of distcp when copying to s3, managing timeouts in particular. Contributed by Aaron Kimball

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #133 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/133/)

        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee: Aaron Kimball
          • Reporter: Aaron Kimball
          • Votes: 0
          • Watchers: 4
