  1. Hadoop Common
  2. HADOOP-15222

Refine proxy user authorization to support multiple ACL list

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: security
    • Labels: None

    Description

      This Jira tracks follow-up work for HADOOP-14077. The original goal of HADOOP-14077 was to support multiple ACL lists. The motivating problem is a separation-of-duty use case in which the company hosting a Hadoop cluster monitors it through JMX, while application logs and HDFS contents must not be visible to the hosting company's system administrators. When AuthenticationFilter checks proxy user authorization, there needs to be a way to authorize normal users and admin users using separate proxy user ACL lists. HADOOP-14060 suggested configuring AuthenticationFilterWithProxyUser this way:

      AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser

      This lets the second AuthenticationFilterWithProxyUser validate the credentials claimed by both the proxy user and the end user.

      However, this has a side effect: unauthorized users are not properly rejected with a 403 FORBIDDEN message if no other web filter is configured to handle the required authorization work.

      This JIRA is intended to discuss how to proceed with the work of HADOOP-14077: either combine StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an AuthorizationFilterWithProxyUser that acts as the final filter and rejects unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to eliminate the false positives in user authorization and impersonation.
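
To make the first option concrete, here is a minimal sketch of the decision a final authorization filter might make. It is not code from any Hadoop patch: the class name `ProxyAuthorizationSketch`, the two ACL maps, and the set of admin-only paths are all hypothetical, and the sketch assumes authentication has already happened upstream. It only illustrates selecting one of two proxy user ACL lists by path and answering 403 when the impersonation is not permitted.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of Hadoop: models only the allow/deny decision
// of a final authorization filter, without any servlet API dependency.
final class ProxyAuthorizationSketch {
    public static final int SC_OK = 200;
    public static final int SC_FORBIDDEN = 403;

    // Each ACL maps a proxy user name to the end users it may impersonate.
    private final Map<String, Set<String>> userAcl;   // applies to ordinary paths
    private final Map<String, Set<String>> adminAcl;  // applies to admin-only paths
    private final Set<String> adminPaths;             // assumed exact-match paths

    public ProxyAuthorizationSketch(Map<String, Set<String>> userAcl,
                                    Map<String, Set<String>> adminAcl,
                                    Set<String> adminPaths) {
        this.userAcl = userAcl;
        this.adminAcl = adminAcl;
        this.adminPaths = adminPaths;
    }

    /**
     * Returns 200 if the request may proceed, 403 otherwise.
     * doAsUser is null when the caller is not impersonating anyone.
     */
    public int authorize(String authenticatedUser, String doAsUser, String path) {
        // Assumption: direct (non-impersonated) access is authorized elsewhere.
        if (doAsUser == null || doAsUser.equals(authenticatedUser)) {
            return SC_OK;
        }
        // Pick the ACL list that governs this path.
        Map<String, Set<String>> acl = adminPaths.contains(path) ? adminAcl : userAcl;
        Set<String> allowed = acl.get(authenticatedUser);
        boolean permitted = allowed != null && allowed.contains(doAsUser);
        return permitted ? SC_OK : SC_FORBIDDEN;
    }
}
```

In this sketch an unknown proxy user, or a proxy user impersonating someone outside the applicable list, is always answered with 403 rather than being passed along to a later filter.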

          Activity

            lmccay Larry McCay added a comment -

            eyang - IMO, we need to revert both HADOOP-14077 and HADOOP-13119 and then determine whether to address the original issue.

            Let's please be clear on what that problem is - can you verify whether the following summarizes it properly?

            1. There are deployments that only allow access through a single proxy entry point
            2. Some resources that are accessible through proxies should only be accessible for admins
            3. Proxyuser enforcement is generally used to restrict proxies from impersonating admins and super users for obvious reasons

            Facts 2 and 3 above create a paradox, so we need to decide whether we should either:

            1. Disable certain paths for proxy users, since they are intended only for direct access by authenticated users, and accept that deployments described in #1 above are out of luck
            2. Open the proxyuser enforcement rules to allow admin access for specific paths

            Personally, I don't believe that the fact that certain resources can't be accessed in deployments that only allow impersonation means that we should redefine the proxyuser enforcement strength.

            I think that it is valid to consider strengthening the proxyuser enforcement to deny access to specific sensitive resources.

            Whether or not certain resources are too sensitive for impersonation can be left up to the deployment.

            eyang Eric Yang added a comment -

            lmccay Thank you for the summary. It is aligned with the original problem statement. Role-based ACLs, as in a standard J2EE web application, would be the right approach to the authorization problem: users describe in web.xml which URL resources are allowed for which roles, and roles are mapped to groups of users. It would be nice to do the same in Hadoop, but Hadoop web applications don't quite follow the J2EE design pattern, which makes the problem hard to solve. We could start by turning the Hadoop Jetty Java code back into configuration and mapping it to roles, but that might take 2-3 years of hard labour. There might be better ways to resolve this issue that we need to explore.

            HADOOP-13119 was backported to Hadoop 2.8.x as a new feature in Hadoop 2.8. Do we revert HADOOP-13119 from 2.8.x, or do we keep it as a temporary solution until the new work is completed?

            lmccay Larry McCay added a comment -

            Revert it from all branches and put it back to proper proxyuser rules enforcement.

            Also, we should add the ability to block a configurable set of resources so that they can't be accessed via impersonation, since impersonation isn't intended for admin users and some resources may be considered sensitive enough to limit to admins.

            We can then have a discussion on whether we want to extend impersonation to admins or not.

            I personally don't think we should, but perhaps it can be controlled well enough with proper config.
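
For illustration only, the idea of blocking a configurable set of resources from impersonated access could be sketched as below. Nothing here is an existing Hadoop API or configuration key: the class `ImpersonationBlockList` and the comma-separated format of its input are assumptions, and matching by path prefix is one possible choice among several.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch, not part of Hadoop: denies impersonated requests
// to any path under a configured set of sensitive prefixes.
final class ImpersonationBlockList {
    private final Set<String> blockedPrefixes;

    /** csv is an assumed config value such as "/logs,/conf". */
    public ImpersonationBlockList(String csv) {
        this.blockedPrefixes = Arrays.stream(csv.split(","))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toSet());
    }

    /** Returns false only for impersonated requests to a blocked resource. */
    public boolean permits(String path, boolean impersonated) {
        if (!impersonated) {
            return true; // direct access is decided by other rules
        }
        for (String prefix : blockedPrefixes) {
            // Match the resource itself or anything beneath it, but not
            // unrelated paths that merely share the leading characters.
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return false;
            }
        }
        return true;
    }
}
```

With a block list of "/logs,/conf", an impersonated request for /logs/app.log is refused while /jmx remains reachable, and the same user connecting directly is unaffected by this check.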

             

            eyang Eric Yang added a comment -

            lmccay Sorry, but until a feasible better proposal exists to secure /log and /jmx, there is no good enough reason to justify reverting HADOOP-13119. arpitagarwal's report on HADOOP-13119 was not valid, and HADOOP-13119 does provide better security by requiring authorized users, rather than allowing anonymous access, for /log. I cannot agree to reverting HADOOP-13119 at this time.

            eyang Eric Yang added a comment - - edited

            Today, Hadoop offers two roles: cluster admin and normal user. A new system monitor role might be required for separation of duty at service hosting companies. The following table gives a rough sketch of the roles required for Hadoop web application endpoints:

            HDFS

            Path      Required role
            /logs     cluster admin
            /jmx      system monitor
            /conf     cluster admin
            /stacks   system monitor

            YARN

            Path      Required role
            /logs     cluster admin
            /jmx      system monitor
            /conf     cluster admin

            This separation will prevent leaks of customer information.
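
As a rough illustration of the table above, the mapping could be expressed as a lookup from service and path to a required role. The class `EndpointRoles` and its methods are hypothetical and not part of Hadoop. The sketch assumes an exact role match (a cluster admin does not implicitly hold the system monitor role) and treats paths absent from the table as unrestricted; both are assumptions that a real design would need to settle.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch, not part of Hadoop: encodes the role table as data.
final class EndpointRoles {
    enum Role { CLUSTER_ADMIN, SYSTEM_MONITOR }

    // service name -> (endpoint path -> role required to access it)
    private static final Map<String, Map<String, Role>> REQUIRED = Map.of(
        "HDFS", Map.of(
            "/logs", Role.CLUSTER_ADMIN,
            "/jmx", Role.SYSTEM_MONITOR,
            "/conf", Role.CLUSTER_ADMIN,
            "/stacks", Role.SYSTEM_MONITOR),
        "YARN", Map.of(
            "/logs", Role.CLUSTER_ADMIN,
            "/jmx", Role.SYSTEM_MONITOR,
            "/conf", Role.CLUSTER_ADMIN));

    /** Role the table requires for this endpoint, or empty if it is not listed. */
    public static Optional<Role> requiredRole(String service, String path) {
        Map<String, Role> byPath = REQUIRED.getOrDefault(service, Map.of());
        return Optional.ofNullable(byPath.get(path));
    }

    /** True if the caller holds the required role, or if no role is listed. */
    public static boolean isAllowed(String service, String path, Set<Role> callerRoles) {
        return requiredRole(service, path).map(callerRoles::contains).orElse(true);
    }
}
```

Under these assumptions a caller holding only the system monitor role can read /jmx and /stacks on HDFS but is refused /logs and /conf, which is the separation the table is meant to express.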


            People

              Assignee: Unassigned
              Reporter: Eric Yang (eyang)