Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: 2.4.1
- Labels: None
Description
Whenever I try to copy data from local to a cluster but have forgotten to create the parent directory first, I get a very confusing error message:
$ whoami
fs111
$ hadoop fs -ls /user
Found 2 items
drwxr-xr-x   - fs111   supergroup          0 2014-08-11 20:17 /user/hive
drwxr-xr-x   - vagrant supergroup          0 2014-08-11 19:15 /user/vagrant
$ hadoop fs -copyFromLocal data data
copyFromLocal: `data': No such file or directory
From the error message, you would think that the local "data" directory does not exist, but that is not the case. What is missing is the "/user/fs111" directory on HDFS. Once I created it, the copyFromLocal command worked fine.
I believe the error message is confusing and should at least be fixed. Even better would be for Hadoop to restore the old 1.x behaviour, where copyFromLocal would simply create the directories if they are missing.
Attachments
- HADOOP-10965.001.patch (13 kB) - John Zhuge
- HADOOP-10965.002.patch (11 kB) - John Zhuge
Issue Links
- breaks: HDFS-10228 TestHDFSCLI fails (Resolved)
- is related to: HADOOP-12971 FileSystemShell doc should explain relative path (Resolved)
Activity
+1 overall
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 0s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
0 | mvndep | 0m 15s | Maven dependency ordering for branch |
+1 | mvninstall | 6m 44s | trunk passed |
+1 | compile | 6m 36s | trunk passed with JDK v1.8.0_66 |
+1 | compile | 7m 5s | trunk passed with JDK v1.7.0_91 |
+1 | checkstyle | 0m 23s | trunk passed |
+1 | mvnsite | 1m 9s | trunk passed |
+1 | mvneclipse | 0m 13s | trunk passed |
+1 | findbugs | 1m 49s | trunk passed |
+1 | javadoc | 0m 52s | trunk passed with JDK v1.8.0_66 |
+1 | javadoc | 1m 4s | trunk passed with JDK v1.7.0_91 |
0 | mvndep | 0m 8s | Maven dependency ordering for patch |
+1 | mvninstall | 0m 40s | the patch passed |
+1 | compile | 6m 10s | the patch passed with JDK v1.8.0_66 |
+1 | javac | 6m 10s | the patch passed |
+1 | compile | 6m 48s | the patch passed with JDK v1.7.0_91 |
+1 | javac | 6m 48s | the patch passed |
+1 | checkstyle | 0m 23s | the patch passed |
+1 | mvnsite | 1m 0s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 2m 3s | the patch passed |
+1 | javadoc | 0m 52s | the patch passed with JDK v1.8.0_66 |
+1 | javadoc | 1m 2s | the patch passed with JDK v1.7.0_91 |
+1 | unit | 7m 9s | hadoop-common in the patch passed with JDK v1.8.0_66. |
+1 | unit | 7m 15s | hadoop-common in the patch passed with JDK v1.7.0_91. |
+1 | asflicense | 0m 23s | Patch does not generate ASF License warnings. |
Total: 61m 34s
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:0ca8df7 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12785465/HADOOP-10965.001.patch |
JIRA Issue | HADOOP-10965 |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux 8c5fd6ea96a3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 2673cba |
Default Java | 1.7.0_91 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91 |
findbugs | v3.0.0 |
JDK v1.7.0_91 Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/8501/testReport/ |
modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
Max memory used | 76MB |
Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org |
Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/8501/console |
This message was automatically generated.
Is the problem that the defaultFS was not set, and thus "hadoop fs -ls /" is listing the local machine instead of HDFS? If so, you can try turning on HADOOP-12043, though I admit there's irony in saying incorrect configuration needs to be fixed via configuration.
What we print now is pretty similar to what happens on a local FS, e.g.:
andrew@fortuna [06:34:42] [~] -> % mkdir a
andrew@fortuna [06:34:44] [~] -> % ls a/b/c
ls: cannot access a/b/c: No such file or directory
There are two working directories in copyFromLocal, so you need to check in two places. Maybe we could also provide a fully-qualified path to help disambiguate? i.e. you'd see something like:
copyFromLocal: `data': No such file or directory: hdfs://nameservice/user/fs111/data
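For illustration, here is a minimal sketch of how a command could produce that qualified form using the public FileSystem API (the class and variable names are mine, not from any patch):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QualifyExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // The path exactly as the user typed it, e.g. "data".
    Path userPath = new Path("data");

    // makeQualified resolves the path against the filesystem URI and
    // the working directory, e.g. hdfs://nameservice/user/fs111/data.
    Path fqPath = fs.makeQualified(userPath);

    if (!fs.exists(fqPath)) {
      // Show both the path as typed and the fully qualified form.
      System.err.println("copyFromLocal: `" + userPath
          + "': No such file or directory: " + fqPath);
    }
  }
}
```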
On second look, it's not a defaultFS problem; the issue is that the shell's working directory is the user's HDFS home dir, and the home dir hasn't been created.
This is the same error you get working on local filesystems (as shown in the example in my comment above), so I'm inclined to close this as a WONTFIX.
Hi andrew.wang,
Thanks for reviewing.
I did a test:
[yzhang@localhost tmp]$ cp abc x/y
cp: cannot create regular file `x/y': No such file or directory
[yzhang@localhost tmp]$ cp abc ~/x/y
cp: cannot create regular file `/home/yzhang/x/y': No such file or directory
[yzhang@localhost tmp]$
The second test indicates that if the path in question is relative to home, the message does expand to the full path.
I think the reported case has the same issue: it's actually checking the user's home dir in HDFS, so it seems worthwhile to print the expanded path including the HDFS user home dir.
Does it make sense to you?
Thanks.
It comes down to relative and absolute paths. In your first example, you provided a relative path, so it prints a relative path in the error message. In the second, "~/x/y" is an absolute path that expands to "/home/yzhang/x/y", so that absolute path is printed.
In this reported case, it's your first example rather than the second. The cwd of the hadoop shell happens to be the user's home dir, but the user is specifying a relative path as the argument. Basically this:
# cd ~
# ls does/not/exist
For the command "fs -put f1 f1", when the default FS is HDFS and the user home does not exist, there are several ways to print the error message:
1. Existing code
put: f1: No such file or directory
2. Print absolute path
put: hdfs://namenode:port/user/jack/f1: No such file or directory
3. Patch 001 - Print a different error message
put: f1: Parent directory not found
4. 2+3
put: hdfs://namenode:port/user/jack/f1: Parent directory not found
Which way to go?
Thanks Andrew, I see your point now.
As long as the user is aware that HDFS resolves relative paths against the user's home directory (effectively, there is a hidden "~/" prefix on all relative paths), then it's fine.
I looked at the doc page https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html; there is no mention of "relative path". I think we can create a doc JIRA to fix that.
Thanks.
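For reference, a small sketch of the resolution rule being discussed, using the public FileSystem API (the class name is mine): a relative Path is qualified against the filesystem's working directory, which for HDFS starts out as the user's home directory.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RelativePathExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // On HDFS the initial working directory is the user's home
    // directory, e.g. hdfs://nameservice/user/<name>.
    System.out.println("home = " + fs.getHomeDirectory());
    System.out.println("cwd  = " + fs.getWorkingDirectory());

    // A relative path such as "f1" therefore qualifies to <home>/f1,
    // i.e. the hidden "~/" prefix described above.
    System.out.println("f1 -> " + fs.makeQualified(new Path("f1")));
  }
}
```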
I am OK either way. Any user with basic HDFS knowledge would only be puzzled for a few minutes before realizing that "f1" means the target and that the relative path implies the user home dir. The fix is trying to be more helpful than the standard Linux behavior.
Thanks guys. I feel like the error message should have the same path as the original user input. We could also provide the qualified path as a debugging assist, but alongside the original path.
Overall, though, I'm leaning toward a doc JIRA. Sounds like we need to explain the CWD behavior.
Thanks all for the good discussion.
Can Hadoop admins change "dfs.user.home.dir.prefix"?
If so, how do Hadoop users know the path of their user home? Or what the CWD is if they decide to use relative paths?
Is it worthwhile to add a "hdfs dfs -pwd"?
andrew.wang, I do see your point about printing the same path in the error message as the input. BTW, absolute paths often expand to long strings, which clutter the console a little.
andrew.wang and yzhangal, the most common and perplexing mistake a new HDFS user can make is forgetting to create the home directory. It could even take a seasoned user a little while to realize the mistake. How about throwing a new HomeDirectoryNotFoundException when a relative path is accessed and the home directory has not been created? The exception should print the home directory path for two reasons: 1) some users may not know the '/user/<name>' format; 2) the admin might have changed the home directory path template (very unlikely, though).
For example:
$ fs -put f1 f1
put: f1: Home directory '/user/jack' not found
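A rough sketch of the proposed check (HomeDirectoryNotFoundException is hypothetical and does not exist in Hadoop; getHomeDirectory, exists, and isAbsolute are real API):

```java
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical exception as proposed above; not part of Hadoop.
class HomeDirectoryNotFoundException extends IOException {
  HomeDirectoryNotFoundException(Path home) {
    super("Home directory '" + home + "' not found");
  }
}

class HomeDirCheck {
  // Before resolving a relative path, verify the home directory exists.
  static void check(FileSystem fs, Path userPath) throws IOException {
    if (!userPath.isAbsolute() && !fs.exists(fs.getHomeDirectory())) {
      throw new HomeDirectoryNotFoundException(fs.getHomeDirectory());
    }
  }
}
```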
So the issue is not homedir-specific; it relates to the CWD (current working directory), which by default happens to be the homedir.
As I suggested above, why not just add the fully qualified path to the error message? The path could be missing on either the local FS or in HDFS, and showing the fully qualified path would address both cases without ambiguity.
Thanks andrew.wang, it sounds good. Is it ok if I change the output a little to this?
$ hdfs dfs -put f1 f1
put: f1 (hdfs://namenode:port/user/jack/f1): No such file or directory
And only use this format when path != fqPath.
I'd prefer the format I used above; the parens are a bit ugly, and having the FQ path there crowds out the provided path. Also, it seems this format lost the quotes around f1?
Patch 002:
- Fix the FS shell copy and touchz commands to print the fully qualified path in the error message when the path cannot be found. It shows the current directory and file system, which can help users take proper corrective action.
Test output:
$ hdfs dfs -touchz f1
touchz: `f1': No such file or directory: `hdfs://nnhost:8020/user/systest/f1'
$ hdfs dfs -put d1 d1
put: `d1': No such file or directory: `hdfs://nnhost:8020/user/systest/d1'
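As a sketch of how such a message can be assembled (PathIOException is a real class in org.apache.hadoop.fs, but exactly how the patch wires the qualified path through it may differ from this):

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathIOException;

class QualifiedError {
  // Build an error showing both the path as typed and its fully
  // qualified form, matching the test output above.
  static PathIOException notFound(FileSystem fs, Path userPath) {
    Path fqPath = fs.makeQualified(userPath);
    return new PathIOException(userPath.toString(),
        "No such file or directory: `" + fqPath + "'");
  }
}
```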
-1 overall
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 16s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
+1 | mvninstall | 7m 7s | trunk passed |
+1 | compile | 6m 29s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 6m 54s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 22s | trunk passed |
+1 | mvnsite | 0m 59s | trunk passed |
+1 | mvneclipse | 0m 13s | trunk passed |
+1 | findbugs | 1m 40s | trunk passed |
+1 | javadoc | 0m 58s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 1m 8s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 43s | the patch passed |
+1 | compile | 6m 25s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 6m 25s | the patch passed |
+1 | compile | 6m 54s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 6m 54s | the patch passed |
+1 | checkstyle | 0m 21s | the patch passed |
+1 | mvnsite | 1m 1s | the patch passed |
+1 | mvneclipse | 0m 14s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 52s | the patch passed |
+1 | javadoc | 0m 58s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 1m 4s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 7m 10s | hadoop-common in the patch failed with JDK v1.8.0_74. |
-1 | unit | 7m 8s | hadoop-common in the patch failed with JDK v1.7.0_95. |
-1 | asflicense | 0m 23s | Patch generated 2 ASF License warnings. |
Total: 61m 27s
Reason | Tests |
---|---|
JDK v1.8.0_74 Timed out junit tests | org.apache.hadoop.util.TestNativeLibraryChecker |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.util.TestNativeLibraryChecker |
This message was automatically generated.
Pushed to trunk, branch-2, and branch-2.8. Thank you, John, for fixing this long-standing issue!
FAILURE: Integrated in Hadoop-trunk-Commit #9514 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9514/)
HADOOP-10965. Print fully qualified path in CommandWithDestination error (wang: rev 8bfaa80037365c0790083313a905d1e7d88b0682)
- hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PathIOException.java
- hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFsShellTouch.java
- hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
- hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Touch.java
- hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFsShellCopy.java
Patch 001:
- Print a different error message when the parent directory does not exist.