Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
-
Reviewed
Description
RawLocalFs#getFileLinkStatus does not actually get the owner and mode of the symlink, but instead uses the owner and mode of the symlink target. If the target can't be found, it fills in bogus values (the empty string and FsPermission.getDefault) for these.
Symlinks have an owner distinct from the owner of the target they point to, and getFileLinkStatus ought to expose this.
In some operating systems, symlinks can have a permission other than 0777. We ought to expose this in RawLocalFilesystem and other places, although we don't necessarily have to support this behavior in HDFS.
Attachments
Attachments
- 0001-temporarily-disable-HADOOP-9652.patch
- 1 kB
- Colin McCabe
- hadoop-9452-1.patch
- 29 kB
- Andrew Wang
- hadoop-9652-2.patch
- 27 kB
- Andrew Wang
- hadoop-9652-3.patch
- 27 kB
- Andrew Wang
- hadoop-9652-4.patch
- 33 kB
- Andrew Wang
- hadoop-9652-5.patch
- 33 kB
- Andrew Wang
- hadoop-9652-6.patch
- 33 kB
- Andrew Wang
- hadoop-9652-workaround.patch
- 2 kB
- Andrew Wang
Issue Links
- breaks
-
MAPREDUCE-5584 ShuffleHandler becomes unresponsive during gridmix runs and can leak file descriptors
- Resolved
-
HADOOP-9782 Datanode daemon cannot be started on OS X
- Resolved
-
HADOOP-9981 globStatus should minimize its listStatus and getFileStatus calls
- Closed
- incorporates
-
HADOOP-9693 Shell should add a probe for OSX
- Closed
- is cloned by
-
HADOOP-10276 RawLocalFs#getFileLinkStatus does not fill in the link owner and mode by default
- Open
- is duplicated by
-
HADOOP-9783 Fix OS detection for RawLocalFileSystem#getFileLinkStatus fallback path
- Resolved
-
HADOOP-9788 Add sticky bit support to org.apache.hadoop.fs.Stat
- Resolved
-
MAPREDUCE-5430 TestMRApps#testSetClasspathWithArchives is failing
- Resolved
- is related to
-
HADOOP-9371 Define Semantics of FileSystem more rigorously
- Closed
-
HADOOP-9788 Add sticky bit support to org.apache.hadoop.fs.Stat
- Resolved
-
HADOOP-9769 Remove org.apache.hadoop.fs.Stat when JDK6 support is dropped
- Patch Available
- relates to
-
HADOOP-9783 Fix OS detection for RawLocalFileSystem#getFileLinkStatus fallback path
- Resolved
-
HADOOP-8040 Add symlink support to FileSystem
- Closed
Activity
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12591965/hadoop-9452-1.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
-1 release audit. The applied patch generated 1 release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2774//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2774//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2774//console
This message is automatically generated.
Looks good. I am glad to see that DelegateToFileSystem is now delegating the symlinks methods-- good idea.
- assertEquals(FsPermission.getDefault(), fsd.getPermission());
This test case should be still applicable, right? Symlinks created by the test user should be owned by him.
+ public static final OSType osType; + + static { + osType = getOSType(); + }
This can al be combined into one line.
Thanks Colin.
Symlinks created by the test user should be owned by him.
True, I added the username check back in. The other tests are unfortunately harder; for instance, symlink permissions are inconsistent across operating systems, and it's unclear to me which of the user's groups is set.
Fixed your other comment too.
I also realized the earlier patch removed Windows support, so I brought the old functions back as a fallback. I think it'd be better to eventually add Windows support to the Stat class so we can remove the old code. Though it should be fine, I haven't tested this on windows either, so maybe cnauroth or ivanmi could verify since we'd like to get this into 2.1.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12592092/hadoop-9652-2.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2778//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2778//console
This message is automatically generated.
These tests have not been working correctly on a Windows local file system, even before this patch. arpitagarwal has been doing most of the coding to get them working in the context of HADOOP-9527. The HADOOP-9527 patch needs a rebase though to reconcile it against recent refactoring work.
Cool, thanks for taking a peek Chris. Just to further clarify, the v1 patch would have broken getFileStatus and getFileLinkStatus for local Windows filesystems, so I figure the v2 still makes sense even if the local Windows symlink tests are currently broken.
I think the v2 patch is thus ready for further review or commit (still applies to trunk for me). Already has a +1 from Jenkins above.
True, I added the username check back in. The other tests are unfortunately harder; for instance, symlink permissions are inconsistent across operating systems, and it's unclear to me which of the user's groups is set.
Newly created files get set to the user's primary group on UNIX.
+ FileStatus status = fsImpl.getFileLinkStatus(f); + // FileSystem#getFileLinkStatus qualifies the link target + // AbstractFileSystem needs to return it plain since it's qualified + // in FileContext, so re-get and set the plain target + if (status.isSymlink()) { + status.setSymlink(fsImpl.getLinkTarget(f)); + } + return status;
This actually highlight something that I think we might need to fix in the FileSystem symlink support-- the fact that it's impossible to view relative symlink paths once they are created.
Let's say I create a symlink from a -> ./b in my local HDFS.
With the new FSShell support in HDFS-4019, Hadoop displays this as:
lrwxrwxrwx - cmccabe supergroup 0 2013-07-23 15:59 /a -> hdfs://localhost:6000/b
It seems to me that we should display relative symlinks as relative, rather than absolute-izing them and adding a URI. Otherwise, if we use DistCp to copy a cluster's data to another cluster, all the symlink targets will be wrong. This is probably out of scope for this particular JIRA, but should we file a follow-up?
+ if (osType == OSType.OS_TYPE_UNIX) { + return new String[] { + "bash", "-c", + "exec 'stat' '" + derefFlag + "c' '%s,%F,%Y,%X,%a,%U,%G,%N' '" + + path + "' 2>&1" };
Shouldn't this be OS_TYPE_LINUX? I don't think, for example, OpenBSD comes with the GNU stat program. We should also really also file a follow-up bug to remove this and move to a more portable solution when we get the green light to drop JDK6 support.
Thanks for the review Colin, new patch attached.
Newly created files get set to the user's primary group on UNIX.
TY, added that check back in.
impossible to view relative symlink paths once they are created
I think FileSystem#getLinkTarget serves this purpose; targets are returned as set.
Shouldn't this be OS_TYPE_LINUX?
Truth, changed it to LINUX. It's not actually a positive Linux detection though, since it's the default OSType.
We should also really also file a follow-up bug to remove this and move to a more portable solution when we get the green light to drop JDK6 support.
Definitely agree on this, there are lots of places we shell out to workaround Java 6. I'll file a JIRA.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12594059/hadoop-9652-3.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2840//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2840//console
This message is automatically generated.
I think FileSystem#getLinkTarget serves the purpose... [of returning targets as set].
Hmm. Not completely convinced, but that also isn't really relevant to this JIRA.
Thanks for creating HADOOP-9769 as well.
+1. thanks, Andrew.
I filed https://issues.apache.org/jira/browse/HADOOP-9780 to discuss the FileSystem#getLinkTarget issue, which is unrelated to this stuff
Tests also broke also on MacOSX because of this. Hoping HDFS-5039/HADOOP-9782 can be addressed quickly. Tx.
Hey folks, sorry about the breakage. I filed a fixup patch at HADOOP-9783, would appreciate if someone could test on their Mac environment.
It looks like this may have also broken distributed cache archive processing, as TestMRApps#testSetClasspathWithArchives is failing after this change. See MAPREDUCE-5430.
It appears to have also broken TestDFSShell#testCount, TestDFSShell#testText, and TestDFSShell#testFilePermissions. Checked out the checkin just prior (git:700817) and these test passed.
Whoa. Anytime you think you need to call "bash -c" you need rethink the change. User input should only be passed a direct arguments to execve. As this patch stands, any shell metachars in a path is going to produce unexpected and possible security issues.
I was originally led to this jira because my dev env broke and we've got people spinning cycles trying to debug new test failures.
Given all issues, I think we should consider reverting the change.
Also breaks TestHttpFSFileSystemLocalFileSystem#testOperationDoAs and TestHttpFSFileSystemLocalFileSystem#testOperation. Concur that we should consider reverting.
Let's just revert, this has caused way more fallout than I thought. I've tracked down most of the issues so far, but it seems cleaner to just do a single fixed patch than all these little JIRAs.
I'll also kick off full-project test builds on my own for future patches to this JIRA, because apparently the common tests are lacking here.
Daryn, that is a good catch. We definitely don't need to call bash -c here. It adds no value, and certainly could be a security vulnerability.
Let's just revert, this has caused way more fallout than I thought. I've tracked down most of the issues so far, but it seems cleaner to just do a single fixed patch than all these little JIRAs.
OK.
OK. I have reverted the patch.
Here is a summary of the issues we know of with the original patch:
- Fallback path should handle MacOS (
HADOOP-9783) - Issue with URI fragment handling and DistCp (
MAPREDUCE-5430) - Stray System.println caused some TestDFSShell failures
- must not use "bash -c"
So we need a new version of the patch which addresses all of these. Additionally, can you touch a file in each project (i.e., hadoop-hdfs-project, hadoop-mapreduce-project, hadoop-yarn-project), so that we get the full Jenkins run, rather than just a Jenkins run on common?
I have managed to isolate the problem we had with TestMRApps. The old code stripped the fragment in FileSystem#resolvePath; the new code does not.
Here's a unit test which only passes on the previous code:
@Test (timeout = 120000) public void testResolvePath() throws Exception { FileSystem fs = FileSystem.get(URI.create("file:///tmp/"), new Configuration()); URI preUri = new URI("file:///tmp#fragment"); Path pre = new Path(preUri); Path post = fs.resolvePath(pre); Assert.assertEquals("resolvePath did not strip the fragment", "file:/tmp", post.toString()); }
Ultimately this comes down to FileSystem#getFileStatus, since that's what resolvePath calls. Andrew, in your new getFileLinkStatusInternal, you currently just put the path passed by the user directly into the FileStatus object.
In contrast, the old code returned new Path(new File(originalPath.toUri().getPath())).getPath()).makeQualified(fs.getUri(), fs.getWorkingDirectory()). After all those conversions back and forth, I think:
- everything is lost except the "path string" – i.e. "/tmp"
- we fully qualify it
For compatibility reasons, I think we probably need to do the same-- strip off everything but the "path string" and fully qualify. I took a quick look at the HDFS (as opposed to LocalFileSystem) code, and it goes through a very different path. As far as I can see, however, DistributedFileSystem#getFileInfo will also strip the "fragment." So at least we are consistent.
trunk is reverted, but does this still need to be reverted from branch-2? Thanks!
cnauroth Please revert branch-2 also. All the test that I reported were failing on branch-2.
Hi everyone,
Here's a new patch that rolls up the previous issues. This includes:
- Stripping out URI fragments from Paths, with a new test for this behavior. This was causing the MR failure.
- Add sticky bit support and remove stray prints. This was breaking DFSShell tests.
- Clean up OS detection to fix Mac OSX test runs
- No more "bash -c". DF still does this, so I'll file another JIRA to improve this.
I've run the failed tests on my end, but would of course appreciate any further testing.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12595479/hadoop-9652-4.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2905//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2905//console
This message is automatically generated.
I tested this on a Solaris derivative (OpenIndiana) and verified that the fallback path for LocalFileSystem worked. It works!
public static String[] getCheckProcessIsAliveCommand(String pid) { - return Shell.WINDOWS ? + return WINDOWS ? new String[] { Shell.WINUTILS, "task", "isAlive", pid } :
Let's make these cleanup changes in a separate JIRA to keep the diff down.
In RawLocalFileSystem, I think we should have a static final boolean that caches the information about whether we are using the fallback path. That also would make the code a little clearer-- it's otherwise not terribly clear why we do something different for linux and freebsd.
+ @Deprecated static class RawLocalFileStatus extends FileStatus { /* We can add extra fields here. It breaks at least CopyFiles.FilePair(). * We recognize if the information is already loaded by check if @@ -697,6 +706,7 @@ public void createSymlink(Path target, Path link, boolean createParent) * the given path does not refer to a symlink or there is an error * accessing the symlink. */ + @Deprecated private String readLink(Path p) {
Can we rename these something that indicates that they're part of the fallback path? I think RawLocalFileSystem#RawLocalFileStatus should be renamed to something like DeprecatedLocalFileStatus.
In general, everything that's only part of the fallback path should be somehow indicated.
I filed https://issues.apache.org/jira/browse/HDFS-5062 to address the existing bash -c in org.apache.hadoop.fs.Df
Okay, one more rev! I had to rebase since some windows fixes have gone in, so some attention would be good.
I went through and renamed the now-deprecated methods to deprecated*, and moved the OS support check into a final boolean like you suggested. Also tried to minimize the diff.
I again ran the tests that failed the last time this was committed, as well as a few in common/HDFS.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12597141/hadoop-9652-5.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2954//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2954//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2954//console
This message is automatically generated.
These are the same metrics2 mbean findbugs warnings that have been cropping up recently, I believe spurious.
@Override public Path getLinkTarget(Path f) throws IOException { FileStatus fi = getFileLinkStatusInternal(f, false); // return an unqualified symlink target return fi.getSymlink(); }
Isn't this going to unconditionally call the Stat() code path, even on platforms where that is not supported? This also suggests to me that getFileLinkStatusInternal should be named something like getNativeFileLinkStatus, to avoid confusion.
Looks reasonable aside from that.
Thanks Colin, I cleaned up the calling structure in this patch and fixed the getLinkTarget issue. Public methods now all delegate to getFileLinkStatus, which then calls either the native or deprecated implementations as per platform support.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12597537/hadoop-9652-6.patch
against trunk revision .
-1 patch. Trunk compilation may be broken.
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2969//console
This message is automatically generated.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12597537/hadoop-9652-6.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2977//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2977//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2977//console
This message is automatically generated.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12597844/hadoop-9652-6.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2978//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/2978//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2978//console
This message is automatically generated.
SUCCESS: Integrated in Hadoop-trunk-Commit #4260 (See https://builds.apache.org/job/Hadoop-trunk-Commit/4260/)
HADOOP-9652. RawLocalFs#getFileLinkStatus does not fill in the link owner and mode. (Andrew Wang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514088)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DelegateToFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HardLink.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Stat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/local/RawLocalFs.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestStat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
SUCCESS: Integrated in Hadoop-Yarn-trunk #302 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/302/)
HADOOP-9652. RawLocalFs#getFileLinkStatus does not fill in the link owner and mode. (Andrew Wang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514088)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DelegateToFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HardLink.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Stat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/local/RawLocalFs.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestStat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
FAILURE: Integrated in Hadoop-Hdfs-trunk #1492 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1492/)
HADOOP-9652. RawLocalFs#getFileLinkStatus does not fill in the link owner and mode. (Andrew Wang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514088)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DelegateToFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HardLink.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Stat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/local/RawLocalFs.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestStat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1519 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1519/)
HADOOP-9652. RawLocalFs#getFileLinkStatus does not fill in the link owner and mode. (Andrew Wang via Colin Patrick McCabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514088)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DelegateToFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/HardLink.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Stat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/local/RawLocalFs.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestStat.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
noting that this patch fixed HADOOP-9693 -add a check for OSX in shell
This change can cause ShuffleHandlers to become unresponsive when the cluster is under load because exists is now a very expensive call in RawLocalFileSystem. See MAPREDUCE-5584.
+1 for reverting this. Lets revisit this later with other symlink issues. sanjay.radia, please consider this in symlink discussions.
I think the easiest way to handle this is to enable the useLegacy thing. We don't need a full revert. Will post a patch shortly.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12609182/0001-temporarily-disable-HADOOP-9652.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common:
org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3235//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3235//console
This message is automatically generated.
I'm going to update it to keep using the new getFileLinkStatus in TestSymlinkLocalFSFileSystem. We don't want those tests to bitrot.
I figured we were applying this fix only to branch-2 as a stopgap for a proper fix using JNI later. Regardless, fixing the unit tests sounds like a good idea.
I'm having some problems running the unit tests on trunk as well. Have the symlink unit tests bitrotted already? Will take a look later.
Any update on this? I see branch-2 is still doing a ton of fork-n-exec overhead for fs.exists() on local files.
Thanks for reminding us of this issue Jason, we definitely need to get it fixed for 2.4. I took Colin's branch-2 patch and fixed the LocalFS tests, and also verified that we no longer see a bunch of execve's of stat with strace and the shell. It's be great if you could verify that this fixes your problem (and +1 and commit if you're comfortable).
+1, thanks Andrew! I plan on committing this but need to clear up something about the state of this JIRA. It looks like the reported problem will still exist, since getFileLinkStatus will not fill in the link owner and mode by default after the workaround patch. In that sense the cleanest thing would be to simply revert and commit a real fix later, but it looks like the last committed patch did more than just fix the described problem (e.g.: also fixing HADOOP-9693). So I'm thinking we need to rename this JIRA to reflect it's laying the groundwork for the eventual fix and file a followup JIRA to track completing the fix. Does that make sense?
Sounds good to me, thanks Jason. I'm honestly okay with reverting symlink
support for local filesystem entirely since they seem to be more trouble
than they're worth, but that's a discussion we can have at a later time.
Updating summary to reflect this is adding the ability to get a link's owner and mode but only if requested. Filed HADOOP-10276 to track the fix for the original reported issue.
Thanks Andrew! I committed the addendum patch to trunk and branch-2.
SUCCESS: Integrated in Hadoop-trunk-Commit #5036 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5036/)
Addendum patch for HADOOP-9652 to fix performance problems. Contributed by Andrew Wang (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561038)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
FAILURE: Integrated in Hadoop-Yarn-trunk #461 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/461/)
Addendum patch for HADOOP-9652 to fix performance problems. Contributed by Andrew Wang (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561038)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1678 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1678/)
Addendum patch for HADOOP-9652 to fix performance problems. Contributed by Andrew Wang (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561038)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
SUCCESS: Integrated in Hadoop-Hdfs-trunk #1653 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1653/)
Addendum patch for HADOOP-9652 to fix performance problems. Contributed by Andrew Wang (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1561038)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestSymlinkLocalFS.java
Patch attached.
I took the approach of shelling out to stat(1) for local filesystems when doing getFileStatus and getFileLinkStatus. Besides fixing the link owner and mode issues, this has the added bonus of showing the correct local mod/access times. Overall, I think this will be more robust than the previous approach.
I also took the opportunity to clean up a bunch of code in DelegateToFileSystem and RawLocalFs. With symlink support in FileSystem, we can simply delegate in almost all cases.
This was tested on Linux, and I plugged some strings from the FreeBSD stat(1) into the new test case (though didn't actually run the tests on FreeBSD).