Hadoop Common / HADOOP-5472

Distcp does not support globbing of input paths

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The current version of distcp does not support globbing of input paths.

      1. HADOOP-5427.patch
        3 kB
        Rodrigo Schmidt
      2. DistcpGlob.txt
        1 kB
        dhruba borthakur

        Issue Links

          Activity

          dhruba borthakur created issue -
          dhruba borthakur added a comment -

          Glob the input paths passed to the DistCp command.

          dhruba borthakur made changes -
          Field Original Value New Value
          Attachment DistcpGlob.txt [ 12402017 ]
          Tsz Wo Nicholas Sze added a comment -

          It seems that fs.globStatus(p) will not return null. If p does not exist, it returns an empty array. So, we only have to check whether inputs.length > 0. If inputs.length > 0, add all paths to the list. If inputs.length == 0, add an IOException to rslt (we don't have to check !fs.exists(p) again).

          Also, could you add a test?
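
          The check described in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the committed patch: the class GlobExpansionSketch, the method expand, and the globber function are hypothetical stand-ins, with globber playing the role of a call such as fs.globStatus(p) and plain strings standing in for paths. The sketch also treats a null result like an empty one, which is a defensive assumption beyond what the comment asks for.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the DistCp input-expansion step; not the real patch.
class GlobExpansionSketch {

    // `globber` is an assumed substitute for fs.globStatus(p): it maps a
    // pattern to the paths it matches (an empty array when nothing matches).
    static List<String> expand(List<String> patterns,
                               Function<String, String[]> globber,
                               List<IOException> rslt) {
        List<String> unglobbed = new ArrayList<>();
        for (String pattern : patterns) {
            String[] inputs = globber.apply(pattern);
            if (inputs != null && inputs.length > 0) {
                // At least one match: add every matched path to the list.
                unglobbed.addAll(Arrays.asList(inputs));
            } else {
                // No match: record the error; no separate exists() check is made.
                rslt.add(new IOException("Input source " + pattern + " does not exist."));
            }
        }
        return unglobbed;
    }

    public static void main(String[] args) {
        // Toy globber for illustration: only "/logs/*" matches anything.
        Function<String, String[]> globber = p ->
            p.equals("/logs/*") ? new String[] {"/logs/a", "/logs/b"} : new String[0];

        List<IOException> rslt = new ArrayList<>();
        List<String> srcs = expand(Arrays.asList("/logs/*", "/missing/*"), globber, rslt);
        System.out.println(srcs + " errors=" + rslt.size());
    }
}
```

          A test along the lines requested here could pass one pattern that matches and one that does not, then assert on both the expanded list and the collected errors.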

          Nigel Daley made changes -
          Fix Version/s 0.20.0 [ 12313438 ]
          dhruba borthakur made changes -
          Assignee dhruba borthakur [ dhruba ] Rodrigo Schmidt [ rschmidt ]
          Rodrigo Schmidt added a comment -

          An ArrayList might not be the best data structure to unglob the input sources, as the internal array will be re-sized for every new entry we add. A LinkedList would fit better.

          Rodrigo Schmidt made changes -
          Attachment HADOOP-5427.patch [ 12408749 ]
          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.21.0 [ 12313563 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12408749/HADOOP-5427.patch
          against trunk revision 778182.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/393/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Dhruba Borthakur and Rodrigo Schmidt!

          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Rodrigo Schmidt made changes -
          Link This issue blocks HADOOP-5927 [ HADOOP-5927 ]
          Owen O'Malley made changes -
          Component/s tools/distcp [ 12312387 ]
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Prashant Kommireddi added a comment -

          Is there any way we could glob with 0.20.2?

          Gavin made changes -
          Link This issue blocks MAPREDUCE-647 [ MAPREDUCE-647 ]
          Gavin made changes -
          Link This issue is depended upon by MAPREDUCE-647 [ MAPREDUCE-647 ]
          Transition | Time In Source Status | Execution Times | Last Executor | Last Execution Date
          Open -> Patch Available | 70d 16h 12m | 1 | Rodrigo Schmidt | 21/May/09 23:30
          Patch Available -> Resolved | 4d 19h 16m | 1 | Tsz Wo Nicholas Sze | 26/May/09 18:46
          Resolved -> Closed | 455d 2h 49m | 1 | Tom White | 24/Aug/10 21:36

            People

            • Assignee:
              Rodrigo Schmidt
              Reporter:
              dhruba borthakur
            • Votes:
              0
              Watchers:
              3