Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: 0.15.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Only provide a security framework as described below. A simple implementation will be provided in HADOOP-2229.

      Previous Description

      In HADOOP-1298, we want to add user information and permissions to the file system. It requires an authentication service and a user management service. We should provide a framework and a simple implementation in this issue and extend it later. As discussed in HADOOP-1298, the framework should be extensible and pluggable.

      • Extensible: possible to extend the framework to the other parts (e.g. map-reduce) of Hadoop.
      • Pluggable: can easily switch security implementations. Below is a diagram borrowed from Java.

        !http://java.sun.com/javase/6/docs/technotes/guides/security/overview/images/3.jpg!

      • Implement a Hadoop authentication center (HAC). In the first step, the mechanism of HAC is very simple: it keeps track of a list of usernames (we only support users; we will work on other principals later) and verifies the username at user login (yeah, no password). HAC can run inside the NameNode or run as a standalone server. We will probably use Kerberos later to provide a more sophisticated authentication service.
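The first-step mechanism described above could be sketched roughly as follows. This is a hypothetical illustration (the class and method names are invented, not from the attached patch): a server-side username list with a password-less login check.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the first-step HAC: a username whitelist, no passwords.
class SimpleHAC {
    private final Set<String> users = new HashSet<String>();

    void addUser(String username) {
        users.add(username);
    }

    // "Login" in the first step is just a membership test; no password involved.
    boolean login(String username) {
        return users.contains(username);
    }
}
```

Whether this runs inside the NameNode or as a standalone server only changes where the user list lives; the check itself stays the same.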
      1. 1701_20071109.patch
        16 kB
        Tsz Wo Nicholas Sze

        Issue Links

          Activity

          Gavin made changes -
          Link This issue is depended upon by HADOOP-1741 [ HADOOP-1741 ]
          Gavin made changes -
          Link This issue blocks HADOOP-1741 [ HADOOP-1741 ]
          Owen O'Malley made changes -
          Component/s dfs [ 12310710 ]
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Tsz Wo Nicholas Sze made changes -
          Resolution Later [ 7 ]
          Status Open [ 1 ] Resolved [ 5 ]
          Tsz Wo Nicholas Sze added a comment -

          Seems that this won't be done in the near future. Closing this as "Later".

          Tsz Wo Nicholas Sze made changes -
          Assignee Tsz Wo (Nicholas), SZE [ szetszwo ]
          Nigel Daley made changes -
          Fix Version/s 0.16.0 [ 12312740 ]
          Robert Chansler made changes -
          Component/s dfs [ 12310710 ]
          Raghu Angadi made changes -
          Link This issue blocks HADOOP-2184 [ HADOOP-2184 ]
          Hairong Kuang made changes -
          Link This issue blocks HADOOP-2229 [ HADOOP-2229 ]
          Sanjay Radia added a comment -

          The concept of an abstract ticket along with the loginCredentials and Security factory in the attached patch can allow us to create a portable layer for plugging in different authentication technologies.

          Java already has such a portable plug-in layer for authentication and authorization: Java's JAAS and GSS APIs. If we can leverage these APIs then we will have free access to the various JAAS and GSS plug-ins for Kerberos, LDAP, Unix, etc. Furthermore, I believe GSS is cross-language.

          The problem is that we don't have the time to figure out exactly how to use the Java APIs for release 0.16, and we want to get permissions into 0.16.

          From what I have seen, I am not sure if the patch's ticket, loginCredentials and Security factory abstractions will help - they do not seem to fit into the Java authentication APIs. Java provides some basic constructs: subject, loginModule and loginContext. Subject overlaps partly with our concept of ticket. The loginModule/Context overlaps with our security factory. If we use the Java APIs then the HDFS code mostly does not need to touch the "tickets" (except when we pass them to the job tracker).

          Of course we are free to ignore the entire java framework and build our own. But that is a big undertaking.
          I propose that we do NOT define the type Ticket and the loginCredentials and Security factory for now.

          For now, the class called UserGroupInfo (see HADOOP-2229), which implements Writable, is sufficient.
          The UserGroupInfo can be passed across at connection establishment.
          For example, for socket creation, the socket factory can take a parameter of type Writable:
          getClientSocketFactory(Writable authenticationInfo);

          The Writable authenticationInfo can be written into the socket and read at the other end.
          Currently we simply need to make UserGroupInfo Writable.
          I don't think the RPC layer needs to do anything besides read and write the authenticationInfo (which is really the tickets).

          After we get some basic permission-checking feature into HDFS, let's try and see if we can fit this into the Java security framework. If we find that it does not fit, then I suggest we define our own framework along the lines of this patch.
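Sanjay's UserGroupInfo-at-connection-establishment idea can be sketched as below. This is a hedged illustration: the Writable interface here is a plain-java stand-in for Hadoop's actual org.apache.hadoop.io.Writable, and the field layout (user name plus group names) is assumed, not taken from the HADOOP-2229 patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for Hadoop's Writable contract.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Sketch of a user/group record sent once, at connection establishment.
class UserGroupInfo implements Writable {
    String user = "";
    String[] groups = new String[0];

    public void write(DataOutput out) throws IOException {
        out.writeUTF(user);
        out.writeInt(groups.length);
        for (String g : groups) out.writeUTF(g);
    }

    public void readFields(DataInput in) throws IOException {
        user = in.readUTF();
        groups = new String[in.readInt()];
        for (int i = 0; i < groups.length; i++) groups[i] = in.readUTF();
    }

    // In-memory round trip, standing in for a write into the socket on one
    // end and a readFields on the other.
    static UserGroupInfo roundTrip(UserGroupInfo in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            in.write(new DataOutputStream(buf));
            UserGroupInfo out = new UserGroupInfo();
            out.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The RPC layer only ferries the bytes and never interprets them, which is the point of the proposal.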

          Tsz Wo Nicholas Sze made changes -
          Link This issue blocks HADOOP-2229 [ HADOOP-2229 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue blocks HADOOP-2229 [ HADOOP-2229 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue blocks HADOOP-2229 [ HADOOP-2229 ]
          Tsz Wo Nicholas Sze made changes -
          Summary Provide a simple authentication service and a user management service Provide a security framework design
          Fix Version/s 0.16.0 [ 12312740 ]
          Description
          Affects Version/s 0.15.0 [ 12312565 ]
          Tsz Wo Nicholas Sze added a comment -

          > Why is "login" a method of AuthenticationModule but "logout" a method of LoginCredential? Isn't it more natural to put both of them in AuthenticationModule?
          I think it is possible to put "logout" in AuthenticationModule.

          > Also it seems Principal has implemented all interface methods so it does not need to be an abstract class.
          It is good to keep Principal abstract since it represents an abstract concept.
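The API shape under discussion might look like the sketch below. The names follow the comments above, but the bodies are hypothetical, with "logout" placed in AuthenticationModule as Hairong suggests.

```java
// Hypothetical sketch of the discussed API; not the actual patch.
abstract class Principal {
    private final String name;
    Principal(String name) { this.name = name; }
    String getName() { return name; }
}

class User extends Principal {
    User(String name) { super(name); }
}

class LoginCredential {
    final Principal principal;
    boolean active = true;
    LoginCredential(Principal p) { principal = p; }
}

// Both login and logout live here, per the suggestion above.
interface AuthenticationModule {
    LoginCredential login(String username);
    void logout(LoginCredential credential);
}

// Trivial password-less implementation, for illustration only.
class SimpleAuthModule implements AuthenticationModule {
    public LoginCredential login(String username) {
        return new LoginCredential(new User(username));
    }
    public void logout(LoginCredential credential) {
        credential.active = false;
    }
}
```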

          Hairong Kuang added a comment -

          One question about the API design: why is "login" a method of AuthenticationModule but "logout" a method of LoginCredential? Isn't it more natural to put both of them in AuthenticationModule? Also, it seems Principal has implemented all interface methods, so it does not need to be an abstract class.

          Tsz Wo Nicholas Sze made changes -
          Link This issue blocks HADOOP-2184 [ HADOOP-2184 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20071109.patch [ 12369260 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20071109.patch [ 12369257 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20071109.patch [ 12369257 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20071009api.patch [ 12367397 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20071009api.patch [ 12367397 ]
          Tsz Wo Nicholas Sze added a comment -

          API updated with current trunk

          Tsz Wo Nicholas Sze made changes -
          Attachment simple20070828.patch [ 12364700 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070828.pdf [ 12364698 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment design20070828.pdf [ 12364699 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070827c_framework.patch [ 12364655 ]
          Raghu Angadi added a comment -

          Is there any open source project that uses the Java security framework in a way similar to how we want to use it? The current HDFS use case is to get user info for every RPC call.

          Tsz Wo Nicholas Sze made changes -
          Attachment simple20070828.patch [ 12364700 ]
          Tsz Wo Nicholas Sze added a comment -
          • design20070828.pdf
            is a Security Design document for the big picture
          • guides20070828.pdf
            is an API & Developer Guides document for Phase 1
          • 1701_20070827c_framework.patch
            is the framework API
          • simple20070828.patch
            is a simple implementation
          Tsz Wo Nicholas Sze made changes -
          Attachment design20070828.pdf [ 12364699 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070828.pdf [ 12364698 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070827.pdf [ 12364654 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070827c_framework.patch [ 12364655 ]
          Tsz Wo Nicholas Sze added a comment -

          add groups to the framework

          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070827b_framework.patch [ 12364643 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070827.pdf [ 12364654 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070822b.pdf [ 12364372 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070827b_framework.patch [ 12364643 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070827framework.patch [ 12364631 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070827framework.patch [ 12364631 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070823framework.patch [ 12364445 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070823framework.patch [ 12364445 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070822b_framework.patch [ 12364370 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070822b.pdf [ 12364372 ]
          Tsz Wo Nicholas Sze added a comment -

          Updated javadoc and the pdf

          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070822b_framework.patch [ 12364370 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070822framework.patch [ 12364362 ]
          Tsz Wo Nicholas Sze added a comment -

          1701_20070822framework.patch

          • Changed the username mechanism; the default is now to read the username from the OS
          • removed getSubject()
          • renamed getPrincipal(...) to getIssuer(...)
          • Split SecurityImpl into two classes, Security and Impl. Impl has to be public since the subclasses usually are not in the same package.
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070822framework.patch [ 12364362 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070821framework.patch [ 12364257 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070822.pdf [ 12364352 ]
          Doug Cutting added a comment -

          I agree with Owen that we should just have a Collector interface and a default implementation that reads from the config and/or OS. Also, isn't it more secure to use UnixSystem#getUserName() than the system property, since normal users cannot modify the former?

          My preference would be to have the default only use the OS. That would make it slightly harder for folks to pretend to be someone else without changing, e.g., JobClient or DFSClient, no? If it's just a config property then anyone can specify root on the command line. We should make the default a bit harder than that to beat, no?
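The Collector idea from this thread might be sketched as follows. The interface name and shape are assumptions; reading the user.name system property is shown because it is portable, but, as Doug points out, it is spoofable (-Duser.name=root on the command line), so a hardened default would ask the OS directly (e.g. via the JDK-internal UnixSystem#getUsername()).

```java
// Hypothetical Collector interface for obtaining the login username.
interface UserNameCollector {
    String getUserName();
}

// Default implementation reading the JVM's user.name property. NOTE: this
// value can be overridden on the command line, so it is the weaker option
// discussed above; an OS-level lookup is harder to spoof.
class SystemPropertyCollector implements UserNameCollector {
    public String getUserName() {
        return System.getProperty("user.name");
    }
}
```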

          Tsz Wo Nicholas Sze made changes -
          Attachment guides20070822.pdf [ 12364352 ]
          Tsz Wo Nicholas Sze added a comment -

          1) I did not really use WritableFactory in the framework. I will revert it.

          2) getSubject(...) is useful for java.security API, for example, access control. It just initializes the principals to the Subject.

          3) getPrincipal(...) is really misleading: it means getIssuer(...)

          4) I just need somewhere to put the public methods, like hadoopLogin(...) . Of course, I can create a new class for that purpose.

          5) I want to provide a general mechanism to get username. We might only need a Collector interface and conf always specifies a subclass.

          Owen O'Malley added a comment -

          1. Please don't use WritableFactory. ReflectionUtils.newInstance() is the preferred interface.
          2. What would be the usage of Ticket.getSubject() ?
          3. I think that Ticket.getPrincipal() should be getTarget() to be clear about which Principal is returned.
          4. I suspect that SecurityImpl should not be a public class.
          5. I think that we should support a single mechanism to get the username. It looks like you are planning on a two level structure.
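The preference for ReflectionUtils.newInstance() over WritableFactory amounts to instantiating a configured class by name through its no-arg constructor. A generic stand-in (not Hadoop's actual ReflectionUtils, which additionally injects the Configuration) might look like:

```java
import java.lang.reflect.Constructor;

// Generic sketch of reflective instantiation; Hadoop's real ReflectionUtils
// also passes the Configuration to Configurable instances.
class Instances {
    static <T> T newInstance(Class<T> clazz) {
        try {
            Constructor<T> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (Exception e) {
            throw new RuntimeException("could not instantiate " + clazz.getName(), e);
        }
    }
}
```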

          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070821framework.patch [ 12364257 ]
          Tsz Wo Nicholas Sze added a comment -

          1701_20070821framework.patch contains only a general framework.

          Tsz Wo Nicholas Sze made changes -
          Attachment users.txt [ 12363744 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070815.patch [ 12363900 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue blocks HADOOP-1741 [ HADOOP-1741 ]
          Tsz Wo Nicholas Sze added a comment -

          Below are my responses to the comments. Sorry for being late.

          For Dhruba's comments:

          (1) We will have a very flexible mechanism to obtain usernames. It will support

          • get the username from OS
          • get the username specified in conf
          • get username by an arbitrary rule

          I will let you know the details later.

          (2) Since UIDs are system dependent, we will use usernames as parameters for intermediate communication. We also generate serial numbers in the NameNode for efficient storage. These serial numbers are used internally and are not visible outside the NameNode.
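The serial-number idea in (2) can be sketched as an interning map (the names here are hypothetical): usernames are mapped to compact integers for storage inside the NameNode, and the integers never leave it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: intern usernames to NameNode-internal serial numbers.
class SerialNumberMap {
    private final Map<String, Integer> toSerial = new HashMap<String, Integer>();
    private final Map<Integer, String> toName = new HashMap<Integer, String>();
    private int next = 0;

    synchronized int getSerial(String username) {
        Integer s = toSerial.get(username);
        if (s == null) {
            s = next++;
            toSerial.put(username, s);
            toName.put(s, username);
        }
        return s;
    }

    synchronized String getName(int serial) {
        return toName.get(serial);
    }
}
```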

          For Allen's comments:

          1. We are going to get rid of the authentication server and user management in the first phase. See also (2) below.
          2. We will assume that when users run Hadoop clients, they are logged in to a network system (e.g. Unix). We use the user account and group information maintained by the network system. Then we do not need any user/group management in Hadoop.
          3. See (2) in the response to Dhruba's comments.
          4. In Hadoop 0.13, the files are stored in the home directories of each user. Then the default owner of all files under a home directory (/home/XXXX) will be that user (i.e. XXXX). For files not inside a home directory, it would be root.
          5. I agree. See also (1) in the response to Dhruba's comments.
          6. I plan to let administrators set up a regular expression in conf.
          7. Currently it is not an issue since we don't have user management. Our goal is to support at least 10k users/groups later on.
          Christophe Taton added a comment -

          Some answers:
          2) To keep the changes small, we do not implement groups for now (owner + others only).
          3) I'll have a look at this. One concern during the design has been to provide an extensible infrastructure for authentication and authorization, with a first attempt to provide simplified POSIX permissions. Providing new/other authorization mechanisms is (to my mind) very easy with the given infrastructure (you extend the HFilePermission class that will be attached to INodes and write your own Policy provider).
          4) When upgrading from a previous HDFS without permissions, the default is that files are owned by root and their permissions are rwx---rwx.
          Moreover, all entities (e.g. Fsck, JSP pages) will run as root in a first step.
          There are get/setPermissions() operations that take POSIXFilePermissions parameters (containing the owner id and the file mode).
          The sticky bit is also not implemented yet (this is functionality I really need, though).

          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070815.patch [ 12363900 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070813c.patch [ 12363743 ]
          Allen Wittenauer added a comment -

          Some thoughts/concerns...

          1) While I realize the first pass was to not actually have passwords, I think it would be beneficial to at least have fields for passwords in such things as users.txt and have the actual authentication stuff just ignore the password field. People will want to automate, and not having the placeholders there will make life hard as things progress.

          2) If the plan is for POSIX-like, where is the default gid in this file as well? Where do group definitions live?

          3) Negative user ids are/were used in UNIX, in particular for nobody on NFS. Using something similar to NFSv4 ID Domains could deal with the issues of private/public username space without having to resort to uid reservations. Additionally, have we looked into supporting NFSv4-style ACLs? It would be good to be as compatible as possible with NFSv4 to open the door for an NFSv4.1 implementation down the road, IMHO.

          4) What's the plan for default ownership, etc, during upgrade? What's the equivalent of chown, chgrp, etc? Sticky bits?

          5) I'd like to keep the default user configurable. I can see where it would be useful to change it under certain conditions, especially when tied to something like Kerberos.

          6) What are the 'rules' governing usernames? What are the legal characters? Length? What are the rules governing uids? [It would be good to look at POSIX for guidance here.]

          7) What happens when the user database gets large? What kind of memory footprint will be required per user? For example, what happens if we have 100,000 users defined?

          dhruba borthakur added a comment -

          I browsed the API description. Looks good. Minor comments:

          1. I wonder if the default user-name should be hard-coded rather than being a configuration variable. Currently, you have login.username to define the default anonymous username.

          2. Maybe the special uids should be negative values (instead of reserving < 1000000). This would allow us to make the hadoop-uids match the users' unix uids. Unix uids typically are positive integers. It might also facilitate easy integration with most LDAP installations.

          Tsz Wo Nicholas Sze made changes -
          Attachment users.txt [ 12363744 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          The user DB can be initialized by uploading a text file. users.txt is a sample.

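Initializing the user DB from an uploaded text file implies a parser on the server side. A hypothetical sketch, assuming a colon-separated line format (username:password:uid) in which the password field is only a placeholder that the simple implementation ignores, as suggested in an earlier comment; the actual format of the attached users.txt sample may differ:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical users.txt parser: one "username:password:uid" entry per
// line; '#' comments and blank lines are skipped, and the password
// placeholder field is deliberately ignored for now.
public class UserDb {
    public static Map<String, Integer> parse(String text) {
        Map<String, Integer> users = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] fields = line.split(":");
            // fields[1] is the ignored password placeholder.
            users.put(fields[0], Integer.parseInt(fields[2]));
        }
        return users;
    }
}
```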
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070813c.patch [ 12363743 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Modified the bin/hadoop script so that simple.HAC and UserAdmin can be executed by:

          • ./bin/hadoop hac
          • ./bin/hadoop useradmin
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070813b.patch [ 12363739 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070813b.patch [ 12363739 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          The previously uploaded file is not correct. Please see 1701_20070813b.patch.

          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070813.patch [ 12363732 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070813.patch [ 12363732 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -
          • Added UserAdmin
          • Updated SimpleHAC junit tests

          Still missing:

          • junit tests for UserAdmin
          • Add login to some components like FsShell
          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070809.patch [ 12363555 ]
          Hide
          dhruba borthakur added a comment -

          We would like to see a document that lists out the design and usage model of this approach. In particular:

          1. Upgrade process from the current disk format to the new one
          2. How does a dfs user specify to the dfs client what user he/she is? Does the dfs shell pick the user from environment variables or from a conf file?
          3. Default permissions for files
          4. Are there permissions on files, directories, or both?
          5. Where is the user-database stored? Is it a flat file? Where is the file located? Is it versioned?
          6. The namenode can be configured to store fsimage in multiple directories. Does this affect where the user-db is stored? How do we handle HAC failure?
          7. How does the administrator modify the user database? Do changes have immediate effect?

          Tsz Wo Nicholas Sze made changes -
          Attachment 1701_20070809.patch [ 12363555 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          You can find the security framework and a simple implementation in 1701_20070809.patch, but it is not yet complete:

          • Need a shell for user management
          • Add login to some components like FsShell
          • Change the junit test.
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HADOOP-1298 [ HADOOP-1298 ]
          Tsz Wo Nicholas Sze made changes -
          Field Original Value New Value
          Link This issue relates to HADOOP-1298 [ HADOOP-1298 ]
          Tsz Wo Nicholas Sze created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved:
