Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.21.0
    • Component/s: security
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently the IPC client sends the UGI which contains the user/group information for the Server. However this represents the groups for the user on the client-end. The more pertinent mapping from user to groups is actually the one seen by the Server. Hence the client should only send the user and we should add a 'group mapping service' so that the Server can query it for the mapping.

      1. MR-1083-0_20.2.patch
        59 kB
        Jitendra Nath Pandey
      2. HADOOP-4656-7.patch
        23 kB
        Boris Shkolnik
      3. HADOOP-4656-6.patch
        22 kB
        Boris Shkolnik
      4. HADOOP-4656-6.patch
        23 kB
        Boris Shkolnik
      5. HADOOP-4656-4.patch
        20 kB
        Boris Shkolnik
      6. HADOOP-4656-2.patch
        19 kB
        Boris Shkolnik
      7. HADOOP-4656-1.patch
        18 kB
        Boris Shkolnik
      8. HADOOP-4656.patch
        14 kB
        Arun C Murthy
      9. HADOOP-4656_0_20090108.patch
        7 kB
        Arun C Murthy

        Issue Links

          Activity

          Hide
          Arun C Murthy added a comment -

          HADOOP-4348 is switching IPC to use the JAAS Subject rather than UGI (which will become an internal artifact). While we are adding the user-to-group mapping service, I propose we change the IPC Client to send the JAAS Subject in the header rather than UGI, this will also be compatible with the way we will do Kerberos-based authentication via the GSS API.

          Show
          Arun C Murthy added a comment - HADOOP-4348 is switching IPC to use the JAAS Subject rather than UGI (which will become an internal artifact). While we are adding the user-to-group mapping service, I propose we change the IPC Client to send the JAAS Subject in the header rather than UGI, this will also be compatible with the way we will do Kerberos-based authentication via the GSS API.
          Hide
          Arun C Murthy added a comment -

          I propose a new abstract class Groups with a single method 'getGroups' as below:

          Groups.java
          public abstract class Groups {
            List<String> getGroups(String username);
          }
          

          with a concrete implementation which gets the unix groups for the given user.

          Show
          Arun C Murthy added a comment - I propose a new abstract class Groups with a single method 'getGroups' as below: Groups.java public abstract class Groups { List< String > getGroups( String username); } with a concrete implementation which gets the unix groups for the given user.
          Hide
          Arun C Murthy added a comment -

          Preliminary patch while I continue testing.

          Show
          Arun C Murthy added a comment - Preliminary patch while I continue testing.
          Hide
          Kan Zhang added a comment -

          > I propose we change the IPC Client to send the JAAS Subject in the header rather than UGI, this will also be compatible with the way we will do Kerberos-based authentication via the GSS API.

          Just want to clarify that application code doesn't send anything when using Kerberos. It's all hiding inside the GSS API library. After authentication, server can query the established GSS context to get client ID as GSSName which can be converted to a String. So for compatibility, IPC Client doesn't have to send JAAS Subject in the header. Send a String is fine.

          Show
          Kan Zhang added a comment - > I propose we change the IPC Client to send the JAAS Subject in the header rather than UGI, this will also be compatible with the way we will do Kerberos-based authentication via the GSS API. Just want to clarify that application code doesn't send anything when using Kerberos. It's all hiding inside the GSS API library. After authentication, server can query the established GSS context to get client ID as GSSName which can be converted to a String. So for compatibility, IPC Client doesn't have to send JAAS Subject in the header. Send a String is fine.
          Hide
          Allen Wittenauer added a comment -

          Groups should definitely come from asking the host OS in some form using the Java equivalent of getgrent() and friends. [ Be aware that getgroups() is BSD-specific and may not be available on System V, such as Solaris and HP-UX.] Doing this via shell call out is just going to exasperate the memory problems we already see, especially on the secondary name node that requires more memory than the primary due to the fork of whoami/id!

          It also opens up yet another security hole where any random groups command on the name nodes path can be used to override. Not Good(tm).

          Show
          Allen Wittenauer added a comment - Groups should definitely come from asking the host OS in some form using the Java equivalent of getgrent() and friends. [ Be aware that getgroups() is BSD-specific and may not be available on System V, such as Solaris and HP-UX.] Doing this via shell call out is just going to exasperate the memory problems we already see, especially on the secondary name node that requires more memory than the primary due to the fork of whoami/id! It also opens up yet another security hole where any random groups command on the name nodes path can be used to override. Not Good(tm).
          Hide
          Allen Wittenauer added a comment -

          Privately, someone asked about caching the group content.

          One of the big advantages of talking to the OS is that many systems include a naming services caching daemon that handles caching group and similar content for the entire machine. nscd generally includes great support for controlling the size, ttl, negative ttl, etc, for the cache. Duplicating that functionality seems like overkill and, worse, will act as a cache against a cache!

          Show
          Allen Wittenauer added a comment - Privately, someone asked about caching the group content. One of the big advantages of talking to the OS is that many systems include a naming services caching daemon that handles caching group and similar content for the entire machine. nscd generally includes great support for controlling the size, ttl, negative ttl, etc, for the cache. Duplicating that functionality seems like overkill and, worse, will act as a cache against a cache!
          Hide
          Kan Zhang added a comment -

          Arun, can we get this one done soon? I'm working on 4343, which depends on this. Thanks.

          Show
          Kan Zhang added a comment - Arun, can we get this one done soon? I'm working on 4343, which depends on this. Thanks.
          Hide
          FROHNER Ákos added a comment -

          Please consider passing the authentication context to the getGroups() method,
          as it might be easier to retrieve the associated groups using that information,
          then based only on the username.

          For example in POSIX environments it is faster to do a lookup based on the
          numeric UID, than based on the username.

          If you are using Kerberos with PAC, then the authentication context may already
          contain a list of associated groups:
          http://k5wiki.kerberos.org/wiki/Projects/PAC_and_principal_APIs

          There is a similar solution based on X509 authentication, where the associated
          list of groups is embedded into the authentication context.

          Show
          FROHNER Ákos added a comment - Please consider passing the authentication context to the getGroups() method, as it might be easier to retrieve the associated groups using that information, then based only on the username. For example in POSIX environments it is faster to do a lookup based on the numeric UID, than based on the username. If you are using Kerberos with PAC, then the authentication context may already contain a list of associated groups: http://k5wiki.kerberos.org/wiki/Projects/PAC_and_principal_APIs There is a similar solution based on X509 authentication, where the associated list of groups is embedded into the authentication context.
          Hide
          Allen Wittenauer added a comment -

          AFAIK, Hadoop has no concept of uid, as everything in the HDFS, etc, is stored as a string. So the username is about all the context you can probably get.

          Show
          Allen Wittenauer added a comment - AFAIK, Hadoop has no concept of uid, as everything in the HDFS, etc, is stored as a string. So the username is about all the context you can probably get.
          Hide
          Arun C Murthy added a comment -

          Preliminary patch, with some testing done.

          Show
          Arun C Murthy added a comment - Preliminary patch, with some testing done.
          Hide
          Boris Shkolnik added a comment -

          This patch will create two versions of SecurityUtil.getSubject. One that builds list of group principles from UGI group list and another one that builds the list from UNIX id command. Do we really need the first one? I suggest we remove it.

          Show
          Boris Shkolnik added a comment - This patch will create two versions of SecurityUtil.getSubject. One that builds list of group principles from UGI group list and another one that builds the list from UNIX id command. Do we really need the first one? I suggest we remove it.
          Hide
          Boris Shkolnik added a comment -

          left both getSubject in place.
          added test for the new getSubject(user)

          Show
          Boris Shkolnik added a comment - left both getSubject in place. added test for the new getSubject(user)
          Hide
          Boris Shkolnik added a comment -

          Removed Timer thread for refreshing cache of groups mapping. Instead using lazy approach - verifying cacheTimeout on access.

          Also implemented few comments form Jakob:
          1. removed author tag
          2. using CommonConfigurationKeys constants for conf.keys.

          Show
          Boris Shkolnik added a comment - Removed Timer thread for refreshing cache of groups mapping. Instead using lazy approach - verifying cacheTimeout on access. Also implemented few comments form Jakob: 1. removed author tag 2. using CommonConfigurationKeys constants for conf.keys.
          Hide
          Jakob Homan added a comment -

          Reviewed patch:

          • Nit: Calling an abstract class GroupMappingImpl seems a bit odd, even if it is technically correct for this. Service provider, maybe?
          • In Groups.java the previous timer-based code is still present, but commented out. Needs removed.
          • Note: HADOOP-6299, if added as-is from the draft posted, will introduce code duplication in terms of executing the shell. When that code is reviewed, we should try to eliminate that.
          • In the unit test, principal is spelled as principle.
          • In the second-to-last line of the unit test, there is a spelling error of subject.
          • The provided unit test is very happy pathy. It'd be great if there were more testing of failures. Gary suggested testing what happens if we pass a user name that doesn't exist.
          Show
          Jakob Homan added a comment - Reviewed patch: Nit: Calling an abstract class GroupMappingImpl seems a bit odd, even if it is technically correct for this. Service provider, maybe? In Groups.java the previous timer-based code is still present, but commented out. Needs removed. Note: HADOOP-6299 , if added as-is from the draft posted, will introduce code duplication in terms of executing the shell. When that code is reviewed, we should try to eliminate that. In the unit test, principal is spelled as principle. In the second-to-last line of the unit test, there is a spelling error of subject. The provided unit test is very happy pathy. It'd be great if there were more testing of failures. Gary suggested testing what happens if we pass a user name that doesn't exist.
          Hide
          Jakob Homan added a comment -

          Of note, with the patch applied I'm seeing a time-out for TestIPC and a failure of TestIPCServerResponder:testServerResponder (null value encountered). I'm not seeing this without the patch applied. We should investigate why this is happening before submitting an updated patch.

          Show
          Jakob Homan added a comment - Of note, with the patch applied I'm seeing a time-out for TestIPC and a failure of TestIPCServerResponder:testServerResponder (null value encountered). I'm not seeing this without the patch applied. We should investigate why this is happening before submitting an updated patch.
          Hide
          Boris Shkolnik added a comment -

          implemented review's comments and fixed the tests

          Show
          Boris Shkolnik added a comment - implemented review's comments and fixed the tests
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12427260/HADOOP-4656-4.patch
          against trunk revision 887472.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12427260/HADOOP-4656-4.patch against trunk revision 887472. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/172/console This message is automatically generated.
          Hide
          Boris Shkolnik added a comment -

          for fakeUser - do not fail NameNode. Return a valid Subject object with 0 lengths groups list.
          Adjusted TestUnixGroupInformation test to this change

          Show
          Boris Shkolnik added a comment - for fakeUser - do not fail NameNode. Return a valid Subject object with 0 lengths groups list. Adjusted TestUnixGroupInformation test to this change
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12427961/HADOOP-4656-6.patch
          against trunk revision 890502.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12427961/HADOOP-4656-6.patch against trunk revision 890502. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/203/console This message is automatically generated.
          Hide
          Jakob Homan added a comment -

          Reviewed v6 of patch:

          • It may be a good idea to explicitly document that getGroups (both in GroupMappingServiceProvider and Groups) will return an empty list for a non-existent user. This avoids any future null-related confusion.
          • Comment on RefreshUserToGroupMappingProtocol seems incomplete
          • "just return the emptly list" - empty spelled incorrectly
          • The change to the exception message in setUserGroupNames no longer reflects the fact that userName shouldn't be a zero-length string
          • In TestUnixUserGroupInformation,
            testConstructorFailures(USER_NAME, new String[0]); // valid case now

            is commented out, and should be removed.

          • In the negative test, an extra assert that getPrincipals is zero-length wouldn't hurt.
          • fail("fakeUser should have no grups");

            - groups is spelled incorrectly

          Show
          Jakob Homan added a comment - Reviewed v6 of patch: It may be a good idea to explicitly document that getGroups (both in GroupMappingServiceProvider and Groups) will return an empty list for a non-existent user. This avoids any future null-related confusion. Comment on RefreshUserToGroupMappingProtocol seems incomplete "just return the emptly list" - empty spelled incorrectly The change to the exception message in setUserGroupNames no longer reflects the fact that userName shouldn't be a zero-length string In TestUnixUserGroupInformation, testConstructorFailures(USER_NAME, new String [0]); // valid case now is commented out, and should be removed. In the negative test, an extra assert that getPrincipals is zero-length wouldn't hurt. fail( "fakeUser should have no grups" ); - groups is spelled incorrectly
          Hide
          Jakob Homan added a comment -

          Almost forgot: is there any reason GroupMappingServiceProvider is an abstract class rather than an interface? It doesn't have any implemented methods...

          Show
          Jakob Homan added a comment - Almost forgot: is there any reason GroupMappingServiceProvider is an abstract class rather than an interface? It doesn't have any implemented methods...
          Hide
          Boris Shkolnik added a comment -

          implemented comments from Jakob.

          Show
          Boris Shkolnik added a comment - implemented comments from Jakob.
          Hide
          Jakob Homan added a comment -

          Looks good. +1 pending Hudson.

          Show
          Jakob Homan added a comment - Looks good. +1 pending Hudson.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428248/HADOOP-4656-7.patch
          against trunk revision 891511.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12428248/HADOOP-4656-7.patch against trunk revision 891511. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/31/console This message is automatically generated.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #121 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/121/)
          . Add a user to groups mapping service (boryas and acmurthy_)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #121 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/121/ ) . Add a user to groups mapping service (boryas and acmurthy_)
          Hide
          Devaraj Das added a comment -

          Marking the issue as resolved.

          Show
          Devaraj Das added a comment - Marking the issue as resolved.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #191 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/191/)
          . Add a user to groups mapping service (boryas and acmurthy_)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #191 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/191/ ) . Add a user to groups mapping service (boryas and acmurthy_)
          Hide
          Jitendra Nath Pandey added a comment -

          Combined patch for HADOOP-4656, HDFS-685 and MAPREDUCE-1083 for Hadoop-20.

          Show
          Jitendra Nath Pandey added a comment - Combined patch for HADOOP-4656 , HDFS-685 and MAPREDUCE-1083 for Hadoop-20.

            People

            • Assignee:
              Boris Shkolnik
              Reporter:
              Arun C Murthy
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development