Details
Description
After merging HDFS-4685 to trunk, some dev environments are experiencing a failure to parse a permission string in TestWebHDFS. The problem appears to occur only in environments with security extensions enabled on the local file system, such as Smack or ACLs.
Attachments
- HADOOP-10354.1.patch
- 0.8 kB
- Chris Nauroth
- HADOOP-10354.2.patch
- 2 kB
- Chris Nauroth
Issue Links
- is broken by
  - HDFS-4685 Implementation of ACLs in HDFS (Closed)
- relates to
  - HADOOP-10220 Add ACL indicator bit to FsPermission. (Resolved)
  - HDFS-5923 Do not persist the ACL bit in the FsPermission (Resolved)
Activity
Hi, yzhangal. Thank you for the bug report. I don't have a repro locally for this. I suspect that you're running with Smack enabled on your local file system. I believe the extra '.' indicator in the permission string indicates the presence of a Smack label. For Jing, it would be a similar situation with '+' appended to files with an ACL on the local file system.
I suspect I know the root cause of this bug. I'll post a patch later tonight.
Hi Jing.
Good to know that you also saw the same problem. I guess most developers keep the default ACL setting. I think it would be nice if the unit test were self-contained, so that its success or failure does not depend on the environment settings we run it under. So I wonder whether the test itself can be modified to loosen the ACL restriction, rather than changing the settings of a specific machine.
What do you think?
Thanks.
Hi Chris,
I was writing my last update and just saw yours. Cool that you figured out the root cause of this bug! Thanks for following up!
--Yongjun
I'm attaching a patch with what I suspect is the fix. yzhangal or jingzhao, would one of you please try the patch in your environment to see if it works? I'm not set up for a repro at the moment.
Here is a bit of background on this. HADOOP-10220 enhanced FsPermission on the feature branch to add an ACL bit, indicating if the file has an ACL. We then realized that the ACL bit had some problematic effects on the rest of the design, so we reverted it in HDFS-5923 and did something else. However, we forgot to revert a change I had made in RawLocalFileSystem. Previously, there had been a special case for ignoring the extra character on the permission string for local file systems using Smack or ACLs. We can simply restore the old version of that code now that the ACL bit is gone.
Jenkins will -1 this patch for no new tests, but there isn't any way to write a reliable test for this, since it depends on the semantics of the dev environment's underlying local file system.
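To illustrate the special case described above, here is a minimal sketch of how a permission field taken from an `ls -ld` style line might be trimmed before parsing. It is not the actual RawLocalFileSystem code: the class name, method names, and sample line are hypothetical, and it assumes the permission field is the first whitespace-separated token and that any character past the tenth is an extra indicator such as '.' or '+'.

```java
// Hypothetical helper for illustration only; not the Hadoop implementation.
final class LocalPermissionField {
  // Length of a plain symbolic permission string such as "drwxrwxr-x".
  static final int BASE_LENGTH = 10;

  private LocalPermissionField() {}

  /** Drops any trailing indicator that follows the ten standard characters. */
  static String stripExtraIndicator(String permission) {
    if (permission.length() > BASE_LENGTH) {
      return permission.substring(0, BASE_LENGTH);
    }
    return permission;
  }

  /** Takes the first token of a listing line and trims it to ten characters. */
  static String permissionField(String listingLine) {
    String firstToken = listingLine.trim().split("\\s+")[0];
    return stripExtraIndicator(firstToken);
  }

  public static void main(String[] args) {
    // Sample line made up for this sketch.
    String line = "drwxrwxr-x. 2 user group 4096 Feb 19 10:34 data";
    System.out.println(permissionField(line)); // prints "drwxrwxr-x"
  }
}
```

With trimming like this on the caller side, the strict ten-character check in the parser never sees the extra character, regardless of how the dev environment's local file system is configured.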
Hi Chris,
Thanks a lot for your quick patch.
I had actually tried hacking
public static FsPermission valueOf(String unixSymbolicPermission) {
earlier today to allow length 11, and it worked for me.
I just tried your patch, and it works too.
One suggestion: I think we can define "10" as a constant somewhere so that it can be shared by the multiple files involved in this implementation. It appears multiple times in different places.
Thanks.
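The shared-constant suggestion could look roughly like the sketch below, where one named constant drives both the strict length check and the caller-side trimming. The class, the constant name, and both methods are hypothetical and are not taken from the patch. The parser is also simplified: it treats every non-'-' character as a set bit and does not handle sticky or setuid markers.

```java
// Hypothetical sketch of sharing one constant; not the committed patch.
final class SymbolicPermissionSketch {
  /** Assumed name for the constant that replaces the literal 10. */
  static final int MAX_PERMISSION_LENGTH = 10;

  private SymbolicPermissionSketch() {}

  /** Strict parser: accepts exactly ten characters, e.g. "drwxrwxr-x". */
  static short parse(String symbolic) {
    if (symbolic == null || symbolic.length() != MAX_PERMISSION_LENGTH) {
      throw new IllegalArgumentException("length != " + MAX_PERMISSION_LENGTH
          + "(unixSymbolicPermission=" + symbolic + ")");
    }
    int mode = 0;
    // Skip the file-type character at index 0; each remaining
    // character contributes one permission bit.
    for (int i = 1; i < symbolic.length(); i++) {
      mode <<= 1;
      if (symbolic.charAt(i) != '-') {
        mode |= 1;
      }
    }
    return (short) mode;
  }

  /** Caller side: trims to the shared length, then parses strictly. */
  static short parseFromLocalListing(String permission) {
    String trimmed = permission.length() > MAX_PERMISSION_LENGTH
        ? permission.substring(0, MAX_PERMISSION_LENGTH)
        : permission;
    return parse(trimmed);
  }

  public static void main(String[] args) {
    short mode = parseFromLocalListing("drwxrwxr-x.");
    System.out.println(Integer.toOctalString(mode)); // prints "775"
  }
}
```

Keeping the literal in one place means the length check and the trimming cannot drift apart if the expected format ever changes.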
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12630219/HADOOP-10354.1.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3592//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3592//console
This message is automatically generated.
Moving the stack trace provided by Yongjun to the comment section:
Hi,
I'm seeing a test failure on the trunk branch locally (CentOS 6) today, and I identified this commit as the cause of the failure.
Author: Chris Nauroth <cnauroth@apache.org> 2014-02-19 10:34:52
Committer: Chris Nauroth <cnauroth@apache.org> 2014-02-19 10:34:52
Parent: 7215d12fdce727e1f4bce21a156b0505bd9ba72a (YARN-1666. Modified RM HA handling of include/exclude node-lists to be available across RM failover by making using of a remote configuration-provider. Contributed by Xuan Gong.)
Parent: 603ebb82b31e9300cfbf81ed5dd6110f1cb31b27 (HDFS-4685. Correct minor whitespace difference in FSImageSerialization.java in preparation for trunk merge.)
Child: ef8a5bceb7f3ce34d08a5968777effd40e0b1d0f (YARN-1171. Add default queue properties to Fair Scheduler documentation (Naren Koneru via Sandy Ryza))
Branches: remotes/apache/HDFS-5535, remotes/apache/trunk, testv10, testv3, testv4, testv7
Follows: testv5
Precedes:
Merge HDFS-4685 to trunk.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1569870 13f79535-47bb-0310-9956-ffa450edef68
I'm not sure whether other folks are seeing the same thing, or whether it is related to my environment. But prior to this change, I didn't see this problem.
The failures are in TestWebHDFS:
Running org.apache.hadoop.hdfs.web.TestWebHDFS
Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 3.687 sec <<< FAILURE! - in org.apache.hadoop.hdfs.web.TestWebHDFS
testLargeDirectory(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: 2.478 sec <<< ERROR!
java.lang.IllegalArgumentException: length != 10(unixSymbolicPermission=drwxrwxr-x.)
at org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764)
at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243)
at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:359)
at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340)
at org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeDirectory(TestWebHDFS.java:229)
testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS) Time elapsed: 0.342 sec <<< ERROR!
java.lang.IllegalArgumentException: length != 10(unixSymbolicPermission=drwxrwxr-x.)
at org.apache.hadoop.fs.permission.FsPermission.valueOf(FsPermission.java:323)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:572)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:540)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1835)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1877)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1859)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1764)
at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1243)
at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:699)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:359)
at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:340)
at org.apache.hadoop.hdfs.TestDFSClientRetries.namenodeRestartTest(TestDFSClientRetries.java:886)
at org.apache.hadoop.hdfs.web.TestWebHDFS.testNamenodeRestart(TestWebHDFS.java:216)
......
Thanks.
Thanks for the review, Yongjun. I'm attaching v2, which refactors to use a shared constant. How does this look?
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12630325/HADOOP-10354.2.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3594//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3594//console
This message is automatically generated.
Hi Chris,
Thanks for addressing my comments. It looks good to me. I also ran the same test with your new patch as a sanity check, and it passes.
+1.
Thanks again, Yongjun.
jingzhao, does the new patch look good to you?
I committed this to trunk. Thank you to Yongjun for reporting the issue. Thank you to both Yongjun and Jing for code reviews.
SUCCESS: Integrated in Hadoop-trunk-Commit #5206 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5206/)
HADOOP-10354. TestWebHDFS fails after merge of HDFS-4685 to trunk. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570655)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java
SUCCESS: Integrated in Hadoop-Yarn-trunk #489 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/489/)
HADOOP-10354. TestWebHDFS fails after merge of HDFS-4685 to trunk. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570655)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java
FAILURE: Integrated in Hadoop-Hdfs-trunk #1681 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1681/)
HADOOP-10354. TestWebHDFS fails after merge of HDFS-4685 to trunk. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570655)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java
SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1706 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1706/)
HADOOP-10354. TestWebHDFS fails after merge of HDFS-4685 to trunk. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570655)
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
- /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/permission/FsPermission.java
I've hit the same test failure as well. The cause on my machine is that I had enabled ACLs on my MacBook. After disabling them, the test passed.
But in the meantime, do we want to loosen the check a little bit for the unit test?