Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If I have an empty directory /testdir1 and then I run the command bin/hadoop distcp /testdir1 /testdir2, the command completes successfully, but does not create the empty directory /testdir2.

      1. HADOOP-5762.2.patch
        3 kB
        Rodrigo Schmidt
      2. HADOOP-5762.patch
        3 kB
        Rodrigo Schmidt

        Issue Links

          Activity

          dhruba borthakur created issue -
          Hide
          Rodrigo Schmidt added a comment - - edited

          The problem happens if any of the source paths is an empty directory. It's because source directories are not "visited" in the inner loop that checks for files and directories to be copied and, thus, are not added to _distcp_src_files.

          Show
          Rodrigo Schmidt added a comment - - edited The problem happens if any of the source paths is an empty directory. It's because source directories are not "visited" in the inner loop that checks for files and directories to be copied and, thus, are not added to _distcp_src_files.
          Rodrigo Schmidt made changes -
          Field Original Value New Value
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s site [ 12312854 ]
          Rodrigo Schmidt made changes -
          Attachment HADOOP-5762.patch [ 12407581 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Hi Rodrigo, which version is your patch for? Does this problem still exist on trunk?

          Show
          Tsz Wo Nicholas Sze added a comment - Hi Rodrigo, which version is your patch for? Does this problem still exist on trunk?
          Hide
          Rodrigo Schmidt added a comment -

          The problem still exists on trunk.

          The patch was generated based on trunk.

          Show
          Rodrigo Schmidt added a comment - The problem still exists on trunk. The patch was generated based on trunk.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407581/HADOOP-5762.patch
          against trunk revision 772960.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/313/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407581/HADOOP-5762.patch against trunk revision 772960. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/313/console This message is automatically generated.
          Rodrigo Schmidt made changes -
          Attachment HADOOP-5762.patch [ 12407581 ]
          Hide
          Rodrigo Schmidt added a comment -

          Just realized I had uploaded the wrong patch. I'm correcting it now.

          Show
          Rodrigo Schmidt added a comment - Just realized I had uploaded the wrong patch. I'm correcting it now.
          Rodrigo Schmidt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Rodrigo Schmidt added a comment -

          Correct patch.

          Show
          Rodrigo Schmidt added a comment - Correct patch.
          Rodrigo Schmidt made changes -
          Attachment HADOOP-5762.patch [ 12407713 ]
          Hide
          Rodrigo Schmidt added a comment -

          The root directory was not being included in the list of files/directories to be copied.

          Besides, copying was aborted if no data was being copied (only directories or empty files). The correct test should be on the number of files being copied.

          Show
          Rodrigo Schmidt added a comment - The root directory was not being included in the list of files/directories to be copied. Besides, copying was aborted if no data was being copied (only directories or empty files). The correct test should be on the number of files being copied.
          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407713/HADOOP-5762.patch
          against trunk revision 772960.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407713/HADOOP-5762.patch against trunk revision 772960. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/319/console This message is automatically generated.
          Hide
          Tsz Wo Nicholas Sze added a comment -
          +    return srcCount > 0;
          

          Checking srcCount may not work well since srcCount counts all src paths. It will be >0 even there is nothing to copy.

          We probably need a new varible dirCount and then return fileCount > 0 || dirCount > 0.

          Show
          Tsz Wo Nicholas Sze added a comment - + return srcCount > 0; Checking srcCount may not work well since srcCount counts all src paths. It will be >0 even there is nothing to copy. We probably need a new varible dirCount and then return fileCount > 0 || dirCount > 0.
          Hide
          Rodrigo Schmidt added a comment -

          You are right, Nicholas. What about using fileCount for that and just add empty directories to it?

          Show
          Rodrigo Schmidt added a comment - You are right, Nicholas. What about using fileCount for that and just add empty directories to it?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > What about using fileCount for that and just add empty directories to it?
          fileCount is also used for the -filelimit option which does not include dirs.

          For creating directories, we don't really need a job. Why don't we make these directories during setup? i.e. mkdir instead of appending them to src_writer/dst_writer. What do you think?

          Show
          Tsz Wo Nicholas Sze added a comment - > What about using fileCount for that and just add empty directories to it? fileCount is also used for the -filelimit option which does not include dirs. For creating directories, we don't really need a job. Why don't we make these directories during setup? i.e. mkdir instead of appending them to src_writer/dst_writer. What do you think?
          Hide
          Rodrigo Schmidt added a comment -

          It could be done. Actually, my first attempt to fix this bug did exactly that. Then I noticed that method copy() was already programmed to copy empty directories and, except for the root source, directories were being added to src_writer on method setup. I think it's a good design option to let copy be the only method that actually writes something to the destination. This makes the code simpler and more elegant (thus, more maintainable).

          I'm don't like the idea of creating only the root src directory (in case it's empty) on setup(). It just doesn't look good. And creating all empty src subdirectories on setup might force us to do a lot of extra checking on setup to cope with failures (checks that already exist on copy()). I don't think many people will be using distcp to copy empty directories and it doesn't look like the performance gain will compensate the loss in code simplicity.

          Show
          Rodrigo Schmidt added a comment - It could be done. Actually, my first attempt to fix this bug did exactly that. Then I noticed that method copy() was already programmed to copy empty directories and, except for the root source, directories were being added to src_writer on method setup. I think it's a good design option to let copy be the only method that actually writes something to the destination. This makes the code simpler and more elegant (thus, more maintainable). I'm don't like the idea of creating only the root src directory (in case it's empty) on setup(). It just doesn't look good. And creating all empty src subdirectories on setup might force us to do a lot of extra checking on setup to cope with failures (checks that already exist on copy()). I don't think many people will be using distcp to copy empty directories and it doesn't look like the performance gain will compensate the loss in code simplicity.
          Rodrigo Schmidt made changes -
          Status Patch Available [ 10002 ] In Progress [ 3 ]
          Hide
          Rodrigo Schmidt added a comment -

          Uploading a new patch.

          Show
          Rodrigo Schmidt added a comment - Uploading a new patch.
          Rodrigo Schmidt made changes -
          Attachment HADOOP-5762.2.patch [ 12407940 ]
          Johan Oskarsson made changes -
          Fix Version/s 0.21.0 [ 12313563 ]
          Fix Version/s site [ 12312854 ]
          Rodrigo Schmidt made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 The latest patch looks good.

          Show
          Tsz Wo Nicholas Sze added a comment - +1 The latest patch looks good.
          Tsz Wo Nicholas Sze made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop Flags [Reviewed]
          Rodrigo Schmidt made changes -
          Link This issue blocks HADOOP-5927 [ HADOOP-5927 ]
          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Tsz Wo Nicholas Sze made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407940/HADOOP-5762.2.patch
          against trunk revision 781178.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407940/HADOOP-5762.2.patch against trunk revision 781178. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/452/console This message is automatically generated.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          The failed tests do not seem related. I will commit this soon.

          Show
          Tsz Wo Nicholas Sze added a comment - The failed tests do not seem related. I will commit this soon.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Rodrigo!

          Show
          Tsz Wo Nicholas Sze added a comment - I have committed this. Thanks, Rodrigo!
          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HADOOP-6053 [ HADOOP-6053 ]
          dhruba borthakur made changes -
          Link This issue relates to MAPREDUCE-689 [ MAPREDUCE-689 ]
          Owen O'Malley made changes -
          Component/s tools/distcp [ 12312387 ]
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Link This issue blocks MAPREDUCE-647 [ MAPREDUCE-647 ]
          Gavin made changes -
          Link This issue is depended upon by MAPREDUCE-647 [ MAPREDUCE-647 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Patch Available Patch Available In Progress In Progress
          2d 22h 24m 1 Rodrigo Schmidt 12/May/09 22:42
          In Progress In Progress Patch Available Patch Available
          14d 9h 18m 1 Rodrigo Schmidt 27/May/09 08:01
          Patch Available Patch Available Open Open
          7d 12h 18m 3 Tsz Wo Nicholas Sze 01/Jun/09 18:30
          Open Open Patch Available Patch Available
          6d 21h 15m 4 Tsz Wo Nicholas Sze 01/Jun/09 18:30
          Patch Available Patch Available Resolved Resolved
          2d 3h 41m 1 Tsz Wo Nicholas Sze 03/Jun/09 22:11
          Resolved Resolved Closed Closed
          446d 22h 25m 1 Tom White 24/Aug/10 20:37

            People

            • Assignee:
              Rodrigo Schmidt
              Reporter:
              dhruba borthakur
            • Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development