Hadoop Common
  1. Hadoop Common
  2. HADOOP-8963

CopyFromLocal doesn't always create user directory

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.0.3
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When you use the command "hadoop fs -copyFromLocal filename ." before the /user/username directory has been created, the file is created with name /user/username instead of a directory being created with file /user/username/filename. The command "hadoop fs -copyFromLocal filename filename" works as expected, creating /user/username and /user/username/filename, and "hadoop fs -copyFromLocal filename ." works as expected if the /user/username directory already exists.

      1. HADOOP-8963.branch-1.patch
        7 kB
        Suresh Srinivas
      2. HADOOP-8963.branch-1.patch
        7 kB
        Arpit Gupta
      3. HADOOP-8963.branch-1.patch
        4 kB
        Arpit Gupta
      4. HADOOP-8963.branch-1.patch
        4 kB
        Arpit Gupta
      5. HADOOP-8963.branch-1.patch
        4 kB
        Arpit Gupta

        Activity

        Hide
        Arpit Gupta added a comment -

        moved to hadoop common as the bug is in FileUtil.checkDest it returns null as the destination with "." is used as the destination and users home dir does not exist. If the file name is used as the destination the file name is returned. I changed the function to call the checkDest again with srcName as the new destination.

        Show
        Arpit Gupta added a comment - moved to hadoop common as the bug is in FileUtil.checkDest it returns null as the destination with "." is used as the destination and users home dir does not exist. If the file name is used as the destination the file name is returned. I changed the function to call the checkDest again with srcName as the new destination.
        Hide
        Arpit Gupta added a comment -

        here is the patch for branch-1 with tests added. testCopyFromLocal1 will fails without this patch.

        Show
        Arpit Gupta added a comment - here is the patch for branch-1 with tests added. testCopyFromLocal1 will fails without this patch.
        Hide
        Arpit Gupta added a comment -

        Looks like trunk does not have this issue and it does not seem to be doing the same in checkDest in FileUtil. I will try to debug how this issue is resolved in trunk and upload a patch with similar solution for branch-1

        Show
        Arpit Gupta added a comment - Looks like trunk does not have this issue and it does not seem to be doing the same in checkDest in FileUtil. I will try to debug how this issue is resolved in trunk and upload a patch with similar solution for branch-1
        Hide
        Daryn Sharp added a comment -

        The shell has been overhauled (by me) on trunk so you can't just backmerge a change. For various reasons, FsShell in trunk is not using FileUtil to copy the files. What someone may want to do is backport the entire shell. It's probably a dropin replacement although it would introduce some incompatibilities - yet fix a ton of bugs related to globbing and exit codes - which is why it hasn't been done. I suppose that other than output changes, it would/should be moderately difficult to revert commands to 1.x "compatible" behavior.

        Show
        Daryn Sharp added a comment - The shell has been overhauled (by me) on trunk so you can't just backmerge a change. For various reasons, FsShell in trunk is not using FileUtil to copy the files. What someone may want to do is backport the entire shell. It's probably a dropin replacement although it would introduce some incompatibilities - yet fix a ton of bugs related to globbing and exit codes - which is why it hasn't been done. I suppose that other than output changes, it would/should be moderately difficult to revert commands to 1.x "compatible" behavior.
        Hide
        Arpit Gupta added a comment -

        @Daryn

        Thanks for info. I was actually wrong and this problem exists on trunk as well.

        ...skipping...
        2012-10-23 09:16:57,838 INFO  hdfs.TestDFSShell (TestDFSShell.java:runCmd(835)) - RUN: -copyFromLocal /Users/arpit/github/hadoop/trunk/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/target/test/data/testCopyFromLocal1.txt .
        2012-10-23 09:16:57,859 DEBUG ipc.Client (Client.java:sendParam(844)) - IPC Client (2029445125) connection to localhost/127.0.0.1:56770 from arpit sending #10
        2012-10-23 09:16:57,859 DEBUG ipc.Server (Server.java:processData(1614)) -  got #10
        2012-10-23 09:16:57,860 DEBUG ipc.Server (Server.java:run(1726)) - IPC Server handler 1 on 56770: has Call#10for RpcKind RPC_PROTOCOL_BUFFER from 127.0.0.1:56779
        2012-10-23 09:16:57,860 DEBUG security.UserGroupInformation (UserGroupInformation.java:logPrivilegedAction(1400)) - PrivilegedAction as:arpit (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:1742)
        2012-10-23 09:16:57,860 DEBUG security.Groups (Groups.java:getGroups(83)) - Returning cached groups for 'arpit'
        2012-10-23 09:16:57,862 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(272)) - allowed=true  ugi=arpit (auth:SIMPLE) ip=/127.0.0.1   cmd=getfileinfo src=/user/arpit dst=null        perm=null
        

        src is "/user/arpit" rather than "/user/arpit/testCopyFromLocal1.txt" when the user uses '.' as the destination and the home dir does not exist.

        I will try to track down where the issue is.

        Show
        Arpit Gupta added a comment - @Daryn Thanks for info. I was actually wrong and this problem exists on trunk as well. ...skipping... 2012-10-23 09:16:57,838 INFO hdfs.TestDFSShell (TestDFSShell.java:runCmd(835)) - RUN: -copyFromLocal /Users/arpit/github/hadoop/trunk/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/target/test/data/testCopyFromLocal1.txt . 2012-10-23 09:16:57,859 DEBUG ipc.Client (Client.java:sendParam(844)) - IPC Client (2029445125) connection to localhost/127.0.0.1:56770 from arpit sending #10 2012-10-23 09:16:57,859 DEBUG ipc.Server (Server.java:processData(1614)) - got #10 2012-10-23 09:16:57,860 DEBUG ipc.Server (Server.java:run(1726)) - IPC Server handler 1 on 56770: has Call#10for RpcKind RPC_PROTOCOL_BUFFER from 127.0.0.1:56779 2012-10-23 09:16:57,860 DEBUG security.UserGroupInformation (UserGroupInformation.java:logPrivilegedAction(1400)) - PrivilegedAction as:arpit (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:1742) 2012-10-23 09:16:57,860 DEBUG security.Groups (Groups.java:getGroups(83)) - Returning cached groups for 'arpit' 2012-10-23 09:16:57,862 INFO FSNamesystem.audit (FSNamesystem.java:logAuditEvent(272)) - allowed= true ugi=arpit (auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/user/arpit dst= null perm= null src is "/user/arpit" rather than "/user/arpit/testCopyFromLocal1.txt" when the user uses '.' as the destination and the home dir does not exist. I will try to track down where the issue is.
        Hide
        Daryn Sharp added a comment -

        I think that's the file stat to see if the parent dir exists. I would expect a file copy to "." to fail if the home dir doesn't exist, but you are saying it's creating /user/<local-file>? That's something I specifically fixed awhile back!

        Show
        Daryn Sharp added a comment - I think that's the file stat to see if the parent dir exists. I would expect a file copy to "." to fail if the home dir doesn't exist, but you are saying it's creating /user/<local-file>? That's something I specifically fixed awhile back!
        Hide
        Arpit Gupta added a comment -

        @Daryn

        Once again you are right . I had a bug in my test where i was not asserting the exit code of the run cmd. copyFromLocal does indeed fail if the home dir does not exist in trunk.

        I will bring up a pseudo distributed cluster with 3.0 and try out various commands and update the jira.

        Show
        Arpit Gupta added a comment - @Daryn Once again you are right . I had a bug in my test where i was not asserting the exit code of the run cmd. copyFromLocal does indeed fail if the home dir does not exist in trunk. I will bring up a pseudo distributed cluster with 3.0 and try out various commands and update the jira.
        Hide
        Arpit Gupta added a comment -

        Confirmed on a pseudo distributed cluster that if the home dir does not exist the copyFromLocal command fails.

        Show
        Arpit Gupta added a comment - Confirmed on a pseudo distributed cluster that if the home dir does not exist the copyFromLocal command fails.
        Hide
        Arpit Gupta added a comment -

        updated the tests in branch-1 to assert the exit code.

        Show
        Arpit Gupta added a comment - updated the tests in branch-1 to assert the exit code.
        Hide
        Daryn Sharp added a comment -

        I'd consider immediately setting the dst to be the srcName when dst is the cwd. This will should save you another loop through checkDest.

        Show
        Daryn Sharp added a comment - I'd consider immediately setting the dst to be the srcName when dst is the cwd. This will should save you another loop through checkDest.
        Hide
        Arpit Gupta added a comment -

        @Daryn

        That was initial plan that i would just return a path with srcName when dst was null, but i changed in case what if the new path existed. But now i realize that wont be the case as this will happen only when dst fs was not found.

        Show
        Arpit Gupta added a comment - @Daryn That was initial plan that i would just return a path with srcName when dst was null, but i changed in case what if the new path existed. But now i realize that wont be the case as this will happen only when dst fs was not found.
        Hide
        Arpit Gupta added a comment -

        Here is the test patch output

         [exec] 
             [exec] BUILD SUCCESSFUL
             [exec] Total time: 4 minutes 56 seconds
             [exec] 
             [exec] 
             [exec] 
             [exec] 
             [exec] -1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     -1 findbugs.  The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings.
        

        Findbugs warnings are not related to this patch

        Show
        Arpit Gupta added a comment - Here is the test patch output [exec] [exec] BUILD SUCCESSFUL [exec] Total time: 4 minutes 56 seconds [exec] [exec] [exec] [exec] [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] -1 findbugs. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings. Findbugs warnings are not related to this patch
        Hide
        Daryn Sharp added a comment -

        Should it at least chain off the if? Otherwise I think it might overwrite when it shouldn't. Ie. something like:

          if (dstFS.exists(dst)) {
            ...
        + } else if (dst.toString().isEmpty()) {
        +   checkDest(null, dstFS, new Path(srcName), overwrite);
          }
          return dst;
        
        Show
        Daryn Sharp added a comment - Should it at least chain off the if? Otherwise I think it might overwrite when it shouldn't. Ie. something like: if (dstFS.exists(dst)) { ... + } else if (dst.toString().isEmpty()) { + checkDest( null , dstFS, new Path(srcName), overwrite); } return dst;
        Hide
        Daryn Sharp added a comment -

        Please add a few tests cases to make sure the overwrite flag is honored.

        Show
        Daryn Sharp added a comment - Please add a few tests cases to make sure the overwrite flag is honored.
        Hide
        Arpit Gupta added a comment -

        Thanks for the feedback Daryn. I will incorporate it and add more tests.

        Show
        Arpit Gupta added a comment - Thanks for the feedback Daryn. I will incorporate it and add more tests.
        Hide
        Arpit Gupta added a comment -

        Attached a new patch where the check is made as part of the else if statement.

        Also added more tests to test the copy methods and the overwrite flags.

        Show
        Arpit Gupta added a comment - Attached a new patch where the check is made as part of the else if statement. Also added more tests to test the copy methods and the overwrite flags.
        Hide
        Arpit Gupta added a comment -

        Added tests where when using fsshell and the dst exists that and cmd fails, and also added tests for the FileUtil.copy with overwrite flags.

        Show
        Arpit Gupta added a comment - Added tests where when using fsshell and the dst exists that and cmd fails, and also added tests for the FileUtil.copy with overwrite flags.
        Hide
        Arpit Gupta added a comment -

        test patch output

        exec] BUILD SUCCESSFUL
             [exec] Total time: 5 minutes 6 seconds
        
             [exec] 
             [exec] 
             [exec] 
             [exec] 
             [exec] -1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     -1 findbugs.  The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings.
             [exec] 
             [exec] 
             [exec] 
             [exec] 
             [exec] ======================================================================
             [exec] ======================================================================
             [exec]     Finished build.
             [exec] ======================================================================
             [exec] ======================================================================
             [exec] 
             [exec] 
        

        Findbugs warnings are not related to this patch.

        Show
        Arpit Gupta added a comment - test patch output exec] BUILD SUCCESSFUL [exec] Total time: 5 minutes 6 seconds [exec] [exec] [exec] [exec] [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 6 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] -1 findbugs. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings. [exec] [exec] [exec] [exec] [exec] ====================================================================== [exec] ====================================================================== [exec] Finished build. [exec] ====================================================================== [exec] ====================================================================== [exec] [exec] Findbugs warnings are not related to this patch.
        Hide
        Daryn Sharp added a comment -

        +1

        Show
        Daryn Sharp added a comment - +1
        Hide
        Suresh Srinivas added a comment -

        Minor update to the patch to change the comment format to doc comments.

        Show
        Suresh Srinivas added a comment - Minor update to the patch to change the comment format to doc comments.
        Hide
        Suresh Srinivas added a comment -

        Committed the patch to branch-1.

        Thank you Arpit.

        Show
        Suresh Srinivas added a comment - Committed the patch to branch-1. Thank you Arpit.
        Hide
        Matt Foley added a comment -

        Closed upon release of Hadoop 1.2.0.

        Show
        Matt Foley added a comment - Closed upon release of Hadoop 1.2.0.

          People

          • Assignee:
            Arpit Gupta
            Reporter:
            Billie Rinaldi
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development