Description
A brief description from my colleague Stephen Fritz who helped discover it:
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile                      <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir                      <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1                    <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile    <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile      <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile

All files are where we expect them... OK, let's try reading:

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String      <-- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String      <-- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String      <-- success!

Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1' and ignored the regular file '/tmp/testdir/testfile'.

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser      <-- let's try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String      <-- good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String      <-- so far so good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
Essentially, we hit an AccessControlException with access=EXECUTE on the file /tmp/testdir/testfile because the glob expansion tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen: testfile is a regular file, not a directory whose children should be looked up.
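The failure mode can be sketched with a toy glob expander (Python here for illustration; Hadoop's actual globbing lives in Java, and the names NAMESPACE, children, and naive_glob are invented for this sketch). A naive expander appends the remaining path components to every match of the wildcard, files included, which is exactly what yields the spurious /tmp/testdir/testfile/testfile lookup:

```python
# Toy model of the bug: naive glob expansion descends into every match
# of the wildcard component, including regular files. Illustrative
# sketch only, not Hadoop's actual globStatus() code.
import fnmatch

# In-memory stand-in for the namespace above: path -> is_directory
NAMESPACE = {
    "/tmp/testdir": True,
    "/tmp/testdir/1": True,
    "/tmp/testdir/1/testfile": False,
    "/tmp/testdir/testfile": False,
}

def children(path):
    """Direct children of a directory in the toy namespace."""
    prefix = path.rstrip("/") + "/"
    return [p for p in NAMESPACE
            if p.startswith(prefix) and "/" not in p[len(prefix):]]

def naive_glob(pattern):
    """Expand one '*' component without checking whether matches are directories."""
    head, star, tail = pattern.partition("*")
    candidates = []
    for child in children(head.rstrip("/")):
        if fnmatch.fnmatch(child, head + star):
            # BUG: the tail is appended even when `child` is a plain file,
            # producing lookups like /tmp/testdir/testfile/testfile.
            candidates.append(child + tail)
    return candidates

print(naive_glob("/tmp/testdir/*/testfile"))
# Includes the bogus path /tmp/testdir/testfile/testfile
```

Resolving that bogus path triggers a getFileInfo() on a child of a plain file, and for a non-superuser that means an EXECUTE permission check against the file's mode bits, which fails.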
The NameNode log confirms the bogus lookup:

2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be examined in a separate JIRA.
This JIRA targets a client-side fix to avoid issuing such /path/file/dir or /path/file/file lookups in the first place.
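The client-side fix amounts to: when expanding a wildcard component that still has path components after it, only descend into matches that are directories. A minimal sketch of that check, under the same invented toy namespace as above (again illustrative Python, not Hadoop's Java implementation):

```python
# Sketch of the client-side fix: a file cannot have children, so a
# non-directory match is skipped instead of having the remaining
# pattern components appended to it. Names here are invented for
# illustration; this is not Hadoop's actual globStatus() code.
import fnmatch

NAMESPACE = {  # path -> is_directory, mirroring the transcript above
    "/tmp/testdir": True,
    "/tmp/testdir/1": True,
    "/tmp/testdir/1/testfile": False,
    "/tmp/testdir/testfile": False,
}

def children(path):
    prefix = path.rstrip("/") + "/"
    return [p for p in NAMESPACE
            if p.startswith(prefix) and "/" not in p[len(prefix):]]

def glob_one_level(pattern):
    """Expand one '*' component, descending only into directories."""
    head, _, tail = pattern.partition("*")
    matches = []
    for child in children(head.rstrip("/")):
        if not fnmatch.fnmatch(child, head + "*"):
            continue
        if tail and not NAMESPACE[child]:
            continue  # fix: never look up children of a regular file
        matches.append(child + tail)
    return matches

print(glob_one_level("/tmp/testdir/*/testfile"))
# Only /tmp/testdir/1/testfile; no lookup under /tmp/testdir/testfile
```

With this check in place, no RPC is ever issued for a child of a regular file, so the spurious EXECUTE permission check on /tmp/testdir/testfile never happens, regardless of which user runs the command.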
Attachments
Issue Links
- duplicates: HADOOP-8906 "paths with multiple globs are unreliable" (Closed)
- incorporates: HADOOP-9068 "Reuse (and not duplicate) globbing logic between FileSystem and FileContext" (Open)