Hadoop Common / HADOOP-5472

Distcp does not support globbing of input paths

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The current version of distcp does not support globbing of input paths.

      1. HADOOP-5427.patch (3 kB, Rodrigo Schmidt)
      2. DistcpGlob.txt (1 kB, dhruba borthakur)

          Activity

          dhruba borthakur added a comment -

          Glob the input paths passed to the DistCp command.

          Tsz Wo Nicholas Sze added a comment -

          It seems that fs.globStatus(p) will not return null. If p does not exist, it returns an empty array. So, we only have to check whether inputs.length > 0. If inputs.length > 0, add all paths to the list. If inputs.length == 0, add an IOException to rslt (we don't have to check !fs.exists(p) again).

          Also, could you add a test?
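
The check suggested in the comment above can be sketched as follows. This is a minimal, self-contained illustration, not the committed patch: `Globber` is a hypothetical stand-in for `fs.globStatus(p)` that returns plain strings instead of `FileStatus` objects, and the names `GlobExpansion` and `expandInputs` are invented for the example. It assumes, as the comment states, that a pattern with no match yields an empty array; if the real call can return null for some inputs, a null check would be needed as well.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class GlobExpansion {
    /** Hypothetical stand-in for fs.globStatus(p); assumed to return an empty array when nothing matches. */
    public interface Globber {
        String[] glob(String pattern);
    }

    /**
     * Expands every input pattern. Matches are appended to the returned list;
     * a pattern with no match adds an IOException to rslt instead.
     */
    public static List<String> expandInputs(List<String> patterns, Globber fs,
                                            List<IOException> rslt) {
        List<String> unglobbed = new ArrayList<>();
        for (String p : patterns) {
            String[] inputs = fs.glob(p);
            if (inputs.length > 0) {
                // At least one match: add all matched paths to the list.
                for (String match : inputs) {
                    unglobbed.add(match);
                }
            } else {
                // Empty result means the source does not exist, so no
                // separate existence check is made (under the stated assumption).
                rslt.add(new IOException("Input source " + p + " does not exist."));
            }
        }
        return unglobbed;
    }
}
```

Collecting the failures in `rslt` rather than throwing on the first one lets the caller report every missing input source in a single pass.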

          Rodrigo Schmidt added a comment -

          An ArrayList might not be the best data structure to unglob the input sources, as the internal array will be re-sized for every new entry we add. A LinkedList would fit better.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12408749/HADOOP-5427.patch
          against trunk revision 778182.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Dhruba Borthakur and Rodrigo Schmidt!

          Prashant Kommireddi added a comment -

          Any way we could glob with 0.20.2?


            People

            • Assignee: Rodrigo Schmidt
            • Reporter: dhruba borthakur
            • Votes: 0
            • Watchers: 3
