Hadoop HDFS / HDFS-4329

DFSShell issues with directories with spaces in name

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 0.23.10, 2.1.1-beta
    • Fix Version/s: 0.23.10, 2.3.0
    • Component/s: hdfs-client
    • Labels: None

      Description

      This bug was discovered by Casey Ching.

      The command dfs -put /foo/hello.txt dir is supposed to create dir/hello.txt on HDFS. This goes wrong when the directory name contains a space: the file ends up under a percent-encoded name instead.

      [adi@haus01 ~]$ hdfs dfs -mkdir 'space cat'
      [adi@haus01 ~]$ hdfs dfs -put /etc/motd 'space cat'
      [adi@haus01 ~]$ hdfs dfs -cat 'space cat/motd'
      cat: `space cat/motd': No such file or directory
      [adi@haus01 ~]$ hdfs dfs -ls space\*
      Found 1 items
      -rw-r--r--   2 adi supergroup        251 2012-12-20 11:16 space%2520cat/motd
      [adi@haus01 ~]$ hdfs dfs -cat 'space%20cat/motd'
      Welcome to Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-30-generic x86_64)
      ...
      

      Note that -put stored the file under the percent-encoded name space%20cat, and the dfs -ls output then encodes that already-encoded name a second time, turning %20 into %2520. The -ls output likewise encodes a literal space as %20:

      [adi@haus01 ~]$ hdfs dfs -touchz 'space cat/foo'
      [adi@haus01 ~]$ hdfs dfs -ls 'space cat'
      Found 1 items
      -rw-r--r--   2 adi supergroup          0 2012-12-20 11:36 space%20cat/foo
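
The %20 and %2520 names above are consistent with a path string being percent-encoded once or twice. The following is a minimal sketch of that effect using only java.net.URI; it is not the Hadoop code, and the class and method names (DoubleEncodingDemo, encodePath, decodePath) are hypothetical. It relies on the documented behavior that the multi-argument URI constructors quote illegal characters and always quote '%'.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of Hadoop.
final class DoubleEncodingDemo {

    /** Percent-encodes a raw path via the multi-argument URI constructor. */
    public static String encodePath(String rawPath) {
        try {
            // scheme, authority, path, query, fragment
            return new URI(null, null, rawPath, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Parses an already-encoded string and returns the decoded path. */
    public static String decodePath(String encoded) {
        return URI.create(encoded).getPath();
    }

    public static void main(String[] args) {
        String once = encodePath("space cat/motd");   // space becomes %20
        String twice = encodePath(once);              // '%' becomes %25
        System.out.println(once);
        System.out.println(twice);
        System.out.println(decodePath(once));         // decoding restores the space
    }
}
```

Encoding an already-encoded string is what produces %2520, so code that displays or reuses a path string needs to decode it exactly once rather than pass the encoded form on.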
      
      Attachments

      1. 4329.trunk.v3.patch (17 kB, Cristina L. Abad)
      2. 4329.branch-0.23.v3.patch (16 kB, Cristina L. Abad)
      3. 4329.trunk.v2.patch (9 kB, Cristina L. Abad)
      4. 4329.branch-2.patch (9 kB, Cristina L. Abad)
      5. 4329.trunk.patch (10 kB, Cristina L. Abad)
      6. 4329.branch-0.23.patch (10 kB, Cristina L. Abad)

          Activity

          Eli Collins added a comment -

          Updating fix version to reflect that this did not make v2 GA.

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1528 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1528/)
          HDFS-4329. DFSShell issues with directories with spaces in name (Cristina L. Abad via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516904)

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/cli/util/CommandExecutor.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk #1501 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1501/)
          HDFS-4329. DFSShell issues with directories with spaces in name (Cristina L. Abad via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516904)

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/cli/util/CommandExecutor.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #709 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/709/)
          HDFS-4329. DFSShell issues with directories with spaces in name (Cristina L. Abad via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516928)

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/cli/util/CommandExecutor.java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #311 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/311/)
          HDFS-4329. DFSShell issues with directories with spaces in name (Cristina L. Abad via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516904)

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/cli/util/CommandExecutor.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #4315 (See https://builds.apache.org/job/Hadoop-trunk-Commit/4315/)
          HDFS-4329. DFSShell issues with directories with spaces in name (Cristina L. Abad via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516904)

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PathData.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/cli/util/CommandExecutor.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
          Jonathan Eagles added a comment -

          Putting this change in before the patches go stale. Thanks, Cristina L. Abad!

          Daryn Sharp added a comment -

          +1 Looks good! As an aside, I spoke to Arun and he confirmed we can put this into 2.1.1.

          Cristina L. Abad added a comment -

          The Findbugs warnings related to class org.apache.hadoop.metrics2.lib.DefaultMetricsSystem were not introduced by this patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12597760/4329.trunk.v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4813//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/4813//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4813//console

          This message is automatically generated.

          Cristina L. Abad added a comment -

          Daryn: thanks for suggesting those tests! It turns out that scheme-qualified paths were broken in the branch-2 and trunk patches. Attaching new patches for 0.23 and trunk (the trunk one also works for branch-2) with the following changes: (1) added 5 more unit tests (relative path, scheme-qualified, and absolute/relative/scheme-qualified with globbing); and (2) the patch for trunk/branch-2 fixes the problem with decoding scheme-qualified paths.

          Cristina L. Abad added a comment -

          Per Daryn's suggestion, added more unit tests.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12597153/4329.trunk.v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
          org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4795//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/4795//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4795//console

          This message is automatically generated.

          Daryn Sharp added a comment -

          Looks good. Please add a few more tests to exercise all the path variants: relative, absolute, scheme-qualified. We've had problems with all three not always working correctly.

          • relative: "a path with/whitespaces in directories"
          • absolute - you did that
          • scheme-qualified: "NAMENODE/a path with/whitespaces in directories"

          Also test that all 3 cases work the same with globs, i.e., try to list "..../a path*".
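
The case matrix requested here (three path variants, each with and without a glob) can be sketched as follows. This is only an illustration built on java.net.URI, not the tests from the patch: the class PathVariantCases, its methods, and the hdfs://namenode:8020 prefix standing in for NAMENODE are all assumptions. It checks that each variant survives one encode/decode round trip with its whitespace intact.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Hadoop or of the attached patches.
final class PathVariantCases {
    // Assumed stand-in for the NAMENODE placeholder.
    public static final String NAMENODE = "hdfs://namenode:8020";

    /** Returns the relative, absolute, and scheme-qualified forms of a path. */
    public static List<String> variants(String relative) {
        List<String> out = new ArrayList<>();
        out.add(relative);
        out.add("/" + relative);
        out.add(NAMENODE + "/" + relative);
        return out;
    }

    /** Percent-encodes a human-readable path string, keeping scheme and authority. */
    public static String encode(String s) {
        String scheme = null, authority = null, path = s;
        int sep = s.indexOf("://");
        if (sep > 0) {
            scheme = s.substring(0, sep);
            int slash = s.indexOf('/', sep + 3);
            authority = slash < 0 ? s.substring(sep + 3) : s.substring(sep + 3, slash);
            path = slash < 0 ? "" : s.substring(slash);
        }
        try {
            return new URI(scheme, authority, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Decodes an encoded URI string back to its human-readable form. */
    public static String decode(String encoded) {
        URI uri = URI.create(encoded);
        StringBuilder sb = new StringBuilder();
        if (uri.getScheme() != null) sb.append(uri.getScheme()).append("://");
        if (uri.getAuthority() != null) sb.append(uri.getAuthority());
        return sb.append(uri.getPath()).toString();
    }

    public static void main(String[] args) {
        String[] bases = {"a path with/whitespaces in directories", "a path*"};
        for (String base : bases) {
            for (String v : variants(base)) {
                System.out.println(v + " -> " + encode(v) + " -> " + decode(encode(v)));
            }
        }
    }
}
```

The glob character '*' is legal in a URI path, so only the spaces are rewritten during encoding.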

          Cristina L. Abad added a comment -

          It's been a month since I posted this patch, so I checked to make sure that it has not gone stale. The 23 patch is fine, and tests are still passing. The trunk patch still works too, but I am re-posting it because the previous one had paths to the files that were specific to my own installation; the new patch does not have this issue. I am also posting a patch for branch 2, which is a copy of the trunk patch; tests pass on branch 2 too.

          Cristina L. Abad added a comment -

          Attaching a patch for trunk. Some comments: (1) I did not make your suggested regexp change, since it would require me to strip the quotes from the result of matcher.group(1); it reduces complexity in the regex and the if statement but adds complexity to remove the quotes. (2) The changes to CommandExecutor.java and testHDFSConf.xml were straightforward, but changing PathData.java was more complicated, since on trunk this class has a fix, not available in 0.23, to deal with Windows paths. Due to this fix, uriToString cannot be static and thus cannot be called from expandAsGlob. I left expandAsGlob as is, as the getStringForChildPath change seems to be enough to fix the whitespace encoding. The tests are passing. Should I change the 0.23 patch to be more similar to the trunk patch?

          Daryn Sharp added a comment -

          Looks good! One minor comment: I think you can reduce the regexp from "(...)|(...)|(...)" to "(...|...|...)" and then always use matcher.group(1) instead of the conditionals on the group number.

          Please post the trunk/2 patch. It'll probably be the same except for the test xml file.
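
The two regexp shapes discussed in this thread can be compared with a small sketch. The patterns below are illustrative assumptions, not the ones in CommandExecutor.java, and the class ArgTokenizerSketch is hypothetical. With one group per alternative the quotes sit outside the captures, so the code must pick the matching group. With a single enclosing group, matcher.group(1) always works, but the capture still contains the quotes and they have to be stripped afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tokenizer; the real patterns in the patch may differ.
final class ArgTokenizerSketch {
    // "(...)|(...)|(...)": one capture group per alternative, quotes excluded.
    static final Pattern SEPARATE = Pattern.compile("\"([^\"]*)\"|'([^']*)'|(\\S+)");
    // "(...|...|...)": one capture group around all alternatives, quotes included.
    static final Pattern COMBINED = Pattern.compile("(\"[^\"]*\"|'[^']*'|\\S+)");

    public static List<String> splitSeparate(String cmd) {
        List<String> args = new ArrayList<>();
        Matcher m = SEPARATE.matcher(cmd);
        while (m.find()) {
            // Conditionals on the group number select whichever alternative matched.
            if (m.group(1) != null) {
                args.add(m.group(1));
            } else if (m.group(2) != null) {
                args.add(m.group(2));
            } else {
                args.add(m.group(3));
            }
        }
        return args;
    }

    public static List<String> splitCombined(String cmd) {
        List<String> args = new ArrayList<>();
        Matcher m = COMBINED.matcher(cmd);
        while (m.find()) {
            String tok = m.group(1);
            char first = tok.charAt(0);
            // Extra step needed by this form: drop the surrounding quotes.
            if (tok.length() >= 2 && (first == '"' || first == '\'')
                    && tok.charAt(tok.length() - 1) == first) {
                tok = tok.substring(1, tok.length() - 1);
            }
            args.add(tok);
        }
        return args;
    }

    public static void main(String[] args) {
        String cmd = "-put /etc/motd 'space cat'";
        System.out.println(splitSeparate(cmd));
        System.out.println(splitCombined(cmd));
    }
}
```

Both forms yield the same arguments for a command such as -put /etc/motd 'space cat'; the difference is only where the complexity lives.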

          Cristina L. Abad added a comment -

          Here is a patch for 0.23, including a unit test. I'll submit patches for trunk and branch 2 next week.

          Cristina L. Abad added a comment -

          Andy, are you working on this issue?

          We have a patch for 0.23 that fixes this problem. If you are not working on this, I can port the patch to branch 2 and trunk and post the patches here.


            People

            • Assignee: Cristina L. Abad
            • Reporter: Andy Isaacson
            • Votes: 0
            • Watchers: 8
