Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If I have an empty directory /testdir1 and then I run the command bin/hadoop distcp /testdir1 /testdir2, the command completes successfully, but does not create the empty directory /testdir2.

      1. HADOOP-5762.2.patch
        3 kB
        Rodrigo Schmidt
      2. HADOOP-5762.patch
        3 kB
        Rodrigo Schmidt

        Issue Links

          Activity

          Hide
          Rodrigo Schmidt added a comment - - edited

          The problem happens if any of the source paths is an empty directory. It's because source directories are not "visited" in the inner loop that checks for files and directories to be copied and, thus, are not added to _distcp_src_files.

          Show
          Rodrigo Schmidt added a comment - - edited The problem happens if any of the source paths is an empty directory. It's because source directories are not "visited" in the inner loop that checks for files and directories to be copied and, thus, are not added to _distcp_src_files.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Hi Rodrigo, which version is your patch for? Does this problem still exist on trunk?

          Show
          Tsz Wo Nicholas Sze added a comment - Hi Rodrigo, which version is your patch for? Does this problem still exist on trunk?
          Hide
          Rodrigo Schmidt added a comment -

          The problem still exists on trunk.

          The patch was generated based on trunk.

          Show
          Rodrigo Schmidt added a comment - The problem still exists on trunk. The patch was generated based on trunk.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407581/HADOOP-5762.patch
          against trunk revision 772960.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/313/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407581/HADOOP-5762.patch against trunk revision 772960. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/313/console This message is automatically generated.
          Hide
          Rodrigo Schmidt added a comment -

          Just realized I had uploaded the wrong patch. I'm correcting it now.

          Show
          Rodrigo Schmidt added a comment - Just realized I had uploaded the wrong patch. I'm correcting it now.
          Hide
          Rodrigo Schmidt added a comment -

          Correct patch.

          Show
          Rodrigo Schmidt added a comment - Correct patch.
          Hide
          Rodrigo Schmidt added a comment -

          The root directory was not being included in the list of files/directories to be copied.

          Besides, copying was aborted if no data was being copied (only directories or empty files). The correct test should be on the number of files being copied.

          Show
          Rodrigo Schmidt added a comment - The root directory was not being included in the list of files/directories to be copied. Besides, copying was aborted if no data was being copied (only directories or empty files). The correct test should be on the number of files being copied.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407713/HADOOP-5762.patch
          against trunk revision 772960.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407713/HADOOP-5762.patch against trunk revision 772960. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/console This message is automatically generated.
          Hide
          Tsz Wo Nicholas Sze added a comment -
          +    return srcCount > 0;
          

          Checking srcCount may not work well since srcCount counts all src paths. It will be >0 even there is nothing to copy.

          We probably need a new varible dirCount and then return fileCount > 0 || dirCount > 0.

          Show
          Tsz Wo Nicholas Sze added a comment - + return srcCount > 0; Checking srcCount may not work well since srcCount counts all src paths. It will be >0 even there is nothing to copy. We probably need a new varible dirCount and then return fileCount > 0 || dirCount > 0.
          Hide
          Rodrigo Schmidt added a comment -

          You are right, Nicholas. What about using fileCount for that and just add empty directories to it?

          Show
          Rodrigo Schmidt added a comment - You are right, Nicholas. What about using fileCount for that and just add empty directories to it?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > What about using fileCount for that and just add empty directories to it?
          fileCount is also used for the -filelimit option which does not include dirs.

          For creating directories, we don't really need a job. Why don't we make these directories during setup? i.e. mkdir instead of appending them to src_writer/dst_writer. What do you think?

          Show
          Tsz Wo Nicholas Sze added a comment - > What about using fileCount for that and just add empty directories to it? fileCount is also used for the -filelimit option which does not include dirs. For creating directories, we don't really need a job. Why don't we make these directories during setup? i.e. mkdir instead of appending them to src_writer/dst_writer. What do you think?
          Hide
          Rodrigo Schmidt added a comment -

          It could be done. Actually, my first attempt to fix this bug did exactly that. Then I noticed that method copy() was already programmed to copy empty directories and, except for the root source, directories were being added to src_writer on method setup. I think it's a good design option to let copy be the only method that actually writes something to the destination. This makes the code simpler and more elegant (thus, more maintainable).

          I'm don't like the idea of creating only the root src directory (in case it's empty) on setup(). It just doesn't look good. And creating all empty src subdirectories on setup might force us to do a lot of extra checking on setup to cope with failures (checks that already exist on copy()). I don't think many people will be using distcp to copy empty directories and it doesn't look like the performance gain will compensate the loss in code simplicity.

          Show
          Rodrigo Schmidt added a comment - It could be done. Actually, my first attempt to fix this bug did exactly that. Then I noticed that method copy() was already programmed to copy empty directories and, except for the root source, directories were being added to src_writer on method setup. I think it's a good design option to let copy be the only method that actually writes something to the destination. This makes the code simpler and more elegant (thus, more maintainable). I'm don't like the idea of creating only the root src directory (in case it's empty) on setup(). It just doesn't look good. And creating all empty src subdirectories on setup might force us to do a lot of extra checking on setup to cope with failures (checks that already exist on copy()). I don't think many people will be using distcp to copy empty directories and it doesn't look like the performance gain will compensate the loss in code simplicity.
          Hide
          Rodrigo Schmidt added a comment -

          Uploading a new patch.

          Show
          Rodrigo Schmidt added a comment - Uploading a new patch.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 The latest patch looks good.

          Show
          Tsz Wo Nicholas Sze added a comment - +1 The latest patch looks good.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407940/HADOOP-5762.2.patch
          against trunk revision 781178.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407940/HADOOP-5762.2.patch against trunk revision 781178. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/console This message is automatically generated.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          The failed tests do not seem related. I will commit this soon.

          Show
          Tsz Wo Nicholas Sze added a comment - The failed tests do not seem related. I will commit this soon.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Rodrigo!

          Show
          Tsz Wo Nicholas Sze added a comment - I have committed this. Thanks, Rodrigo!

            People

            • Assignee:
              Rodrigo Schmidt
              Reporter:
              dhruba borthakur
            • Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development