Issue Details (XML | Word | Printable)

Key: HADOOP-4106
Type: New Feature New Feature
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Pete Wyckoff
Reporter: Pete Wyckoff
Votes: 0
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

add time, permission and user attribute support to fuse-dfs

Created: 08/Sep/08 03:48 AM   Updated: 08/Jul/09 05:05 PM
Return to search
Component/s: None
Affects Version/s: None
Fix Version/s: 0.19.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works HADOOP-4106.txt 2008-09-16 06:16 PM Pete Wyckoff 37 kB
Text File Licensed for inclusion in ASF works HADOOP-4106.txt 2008-09-16 01:42 AM Pete Wyckoff 36 kB
Text File Licensed for inclusion in ASF works HADOOP-4106.txt 2008-09-10 07:45 AM Pete Wyckoff 19 kB
Text File Licensed for inclusion in ASF works HADOOP-4106.txt 2008-09-08 03:53 AM Pete Wyckoff 12 kB
Issue Links:
Blocker
 
Duplicate
 
Incorporates
 

Hadoop Flags: Reviewed
Release Note: Added time, permission and user attribute support to libhdfs.
Resolution Date: 18/Sep/08 07:40 PM


 Description  « Hide
add:

dfs_chown
dfs_utime
dfs_chmod

Change open to have its own FS on writes (should we do this on reads too??) and use it for writes and disconnect when closing the file
Chane mkdir to open the FS itself and then close it
also added comments for dfs_access (which needs FileSystem support/libhdfs support) and I added the dfs_symlink and truncate since these 3 are the only 3 things left as far as functionality.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Pete Wyckoff added a comment - 08/Sep/08 03:53 AM
This is the code I wrote and compiles and is what I was thinking. May need some debugging and definitely needs unit tests but I though I would post what I have so far.

Pete Wyckoff added a comment - 08/Sep/08 04:07 AM
without implementing posix access method, fuse still won't respect permissions. I have opened a JIRA for the FileSystem to implement it and another for fuse to then implement it.

Craig Macdonald added a comment - 08/Sep/08 11:06 AM
Pete,
  • I dont think doConnectAsUser is correct. You are using the filesystem of the user running fuse_dfs, not the filesystem of the user making an IO operation through fuse_dfs. You should use the fuse_context to determine the uid of the calling user, convert to user/group names, and then pass these to libhdfs.
fuse_context * context = fuse_get_context ();
uid_t uid = context->uid;
gid_t uid = context->gid;
  • Could dfs_access be implemented by checking hdfsFileInfo?

Pete Wyckoff added a comment - 08/Sep/08 05:26 PM
yes, duh, I knew that but for some reason I must have been thinking or not thinking rather. I don't know why I did that

I was thinking I could, but that DFS should have it anyway really and might be better to implement all the rules there. If the DFS community doesn't want to do it (although it is a posix api) or cannot do it soon, I will implement it in libhdfs??


Pete Wyckoff added a comment - 10/Sep/08 07:45 AM
fix getuid/getgid issues and some bugs. Still doesn't pass the utimes test of touching a file and seeing the modification time updated. I'm not sure why as touch is definitely ending up calling utime, but the mtime doesn't seem to get updated.

probably just needs another short debugging session to complete the patch.


Pete Wyckoff added a comment - 12/Sep/08 06:38 PM
I can't make chown work yet. having trouble with getpwuid - returning:
struct passwd {
                    char    *pw_name;      /* user name */
                    char    *pw_passwd;    /* user password */
                    uid_t   pw_uid;        /* user id */
                    gid_t   pw_gid;        /* group id */
                    char    *pw_gecos;     /* real name */
                    char    *pw_dir;       /* home directory */
                    char    *pw_shell;     /* shell program */
              };

and pw_name is returning the password field in the /etc/passwd file instead of the login name.

could use help if anyone knows about these pwd.h apis.


Craig Macdonald added a comment - 12/Sep/08 08:39 PM
Pete,

1.
Is it for connectAsUser or chown that you are having the issue?

See my attachment for HADOOP-3536. This worked OK for me on Linux, and MacOSX IIRC.

2. Also, I was thinking about the gid_t uid = context->gid;

Surely this only gives the default group of the user? How would this work in my setting, where my default group represents my status, while most files are protected on a groups which are assigned on a project level basis?

In the RPC mechanism used by NFS, the following struct is used to pass authentication information from client to server (source: https://trac.eecs.iu-bremen.de/projects/scotty/browser/branches/tubs/compat/rpc/auth_unix.h?rev=885):

struct authunix_parms {
        u_long   aup_time;
        char    *aup_machname;
        int      aup_uid;
        int      aup_gid;
        u_int    aup_len;
        int     *aup_gids;
};

The point s that a process accessing a file in the NFS client has a uid and a gid. However, the NFS client supplements the gid with the other gids that the user belongs to. We should do likewise.


Pete Wyckoff added a comment - 16/Sep/08 01:42 AM - edited
fixes all the problems and I added all the groups using Craig's other patch.

ant test-contrib -Dfusedfs=1 -Dlibhdfs=1 -Dtestcase=TestFuseDFS

passes

(with hadoop-4014 patch applied - the patch for libhdfs to expose these apis).


Craig Macdonald added a comment - 16/Sep/08 12:55 PM - edited
Looks good.

1. What are the side-effects of not having dfs_access implemented?
2. Should HAVE_GETGROUPS be defined by the autoconf macro AC_FUNC_GETGROUPS?

C


Pete Wyckoff added a comment - 16/Sep/08 06:16 PM
added

AC_FUNC_GETGROUPS
AC_TYPE_GETGROUPS

to and check GETGROUPS_T in fuse_dfs.c as Craig suggested


Pete Wyckoff added a comment - 16/Sep/08 06:19 PM
Yes, getting everything else in other than actually respecting the ACLs is a little bit of a hollow victory

I'm not even sure there's complete consensus on https://issues.apache.org/jira/browse/HADOOP-4108. The fuse-dfs change is trivial as should the libhdfs change. Don't know if someone can get 4108 in on time for the 19 feature freeze on Friday I think.


Pete Wyckoff added a comment - 16/Sep/08 08:42 PM

Hi Craig,

I could add a #define HDFS_ACCESS_T with a call to an as yet unimplemented libhdfs API. It is a little confusing i agree to show permissions and be capable of changing them but the FS doesn't respect them.

Do you want HADOOP-4104 to be a blocker, or do you want to +1 this patch? I'm fine either way, as in some sense you are right it is confusing.

thanks, pete


Craig Macdonald added a comment - 16/Sep/08 08:49 PM
I dont see how this patch can be committed without HADOOP-4104. We know that it is blocked by it.

Wrt HADOOP-4108, I'm not sure. You need to get (a) a consensus, (b) an acceptable patch, and (c) the patch committed by Friday.
I think the correction question is whether HADOOP-4108 should be a blocker for this patch.

Does the fuse-dfs currently let me read/write files that I dont have permission to do so?

C


Allen Wittenauer added a comment - 16/Sep/08 09:03 PM - edited
I haven't looked at the patch, but how do you determine if the user is permitted to change the ownership/permission of a file without access() being implemented?

Pete Wyckoff added a comment - 16/Sep/08 09:20 PM
I am assuming the Hadoop API will prevent this and the user will get an IO error - (which has happened to me ), but I should add a unit test for this and see if there's a way to return -EPERM.

Allen Wittenauer added a comment - 16/Sep/08 10:57 PM
Now that I think about it, access() may not be necessary for chown()/chmod().

In particular, I'm thinking of a bug that happened in Solaris cpio (or was it pax?) eons ago where a user had a privilege assigned that allowed that user to chown() files. But that user couldn't actually read the files. So doing a raw chown() with no access check was the right thing to do.

If Hadoop were to mature enough to have a role system, doing access() followed by chown()/chmod() would be the wrong thing to do.

So, yes, I think it is ok to do chown()/chmod() without access() and just let the EPERM error or whatever Hadoop is throwing back to the user.

Of course, this begs the question about what FUSE will do under roles, but that's a different discussion for a different day.


Pete Wyckoff added a comment - 17/Sep/08 05:02 AM

Does the fuse-dfs currently let me read/write files that I dont have permission to do so?

Yes, it does unconditionally. So, this patch will improve things. the only thing is you will get an IO error rather than a correct error message saying you don't have permissions. So, in that respect, this patch is an improvement. Albeit, getting an IO error rather than a permission error is a little weird. I guess the question is whether it's preferable to let people see files unconditionally or get an IO error when they do??

I think getting 4108 committed by Friday isn't possible.

An alternative would be to assume -EPERM when libhdfs gives an error, but that would be pretty hacky.


Hadoop QA added a comment - 17/Sep/08 05:11 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390211/HADOOP-4106.txt
against trunk revision 696130.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3280/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3280/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3280/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3280/console

This message is automatically generated.


Pete Wyckoff added a comment - 17/Sep/08 06:40 PM

I am basically +1 on committing this w/o access, so could others vote +1 or -1 on committing this w/o being blocked by having the access method? (ie HADOOP-4108).

Again the problem with this patch is people will get an I/O error when accessing a file they don't have permission to rather than a permission error. w/o this patch, there are no access privileges as fuse always connects as the super user.

So, I think the patch is an improvement.

peter


Craig Macdonald added a comment - 17/Sep/08 07:22 PM
+1. Commit.

C


Zheng Shao added a comment - 18/Sep/08 07:40 PM
Committed. Thanks Pete.
Committed revision 696777.

Hudson added a comment - 22/Sep/08 03:18 PM