Thanks for reviewing and providing feedback on the design. You have asked some good questions, so let me add some more context on the design choices and why we made them. Hopefully this additional context sheds some light; please feel free to ask if you still have questions or concerns.
> 1. The new diagram (p. 3) that describes client/TAS/AS/IdP/Hadoop Services interaction shows a client providing credentials to TAS, which then provides the credentials to the IdP. From a security perspective, this seems like a bad idea. It defeats the purpose of having an IdP in the first place. Is this an oversight or by design?
From the client's point of view, the TAS is trusted by the client for authentication; whether client credentials can be passed to TAS directly depends on the IdP's capabilities and on deployment decisions. If the IdP can generate a token and is federated with TAS, then that token can be used to authenticate with TAS and obtain an identity token for the Hadoop cluster. If the IdP does not have the capability to generate a trusted token (e.g. LDAP), then there are several alternate solutions depending on the deployment scenario.
In the first scenario, TAS and the IdP are deployed in the same organization on the same network, so TAS can access the IdP directly. Credentials are passed to TAS securely (over SSL), and TAS then passes them to the IdP, e.g. an LDAP server.

In the second scenario, TAS and the IdP are deployed on different networks and TAS cannot contact the IdP directly: for example, the LDAP server lives inside the enterprise, TAS is deployed in the cloud, and the client is trying to access the cluster from the enterprise. Here, an agent trusted by the client can be deployed to collect the client's credentials, pass them to LDAP (the IdP), and present a token to the external TAS to complete the authentication process. This agent can itself be another TAS.

The third scenario is similar to the second, except that the client is trying to access the cluster from a public network (for example a cloud environment) but still needs to use the enterprise LDAP as the IdP. In this case, an agent (which can be a TAS) needs to be deployed as a gateway on the enterprise side to collect credentials.
In any of the above scenarios, when the IdP lacks the capability to generate a token as a result of authentication, TAS can act as the agent trusted by the client to collect credentials for first-mile authentication. Those considerations are why we drew the flow as shown on page 3.
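To make the first scenario concrete, here is a minimal sketch of the first-mile flow: the client hands credentials to TAS, TAS validates them against an IdP that can only verify credentials (e.g. LDAP binds), and TAS, not the IdP, mints the identity token. Every class and method name here is an illustrative assumption, not part of the actual design or any Hadoop API.

```python
import secrets

class LdapLikeIdP:
    """Stand-in for an IdP such as LDAP: it can verify credentials
    but cannot issue tokens of its own."""
    def __init__(self, directory):
        self._directory = directory  # user -> password (toy store)

    def verify(self, user, password):
        return self._directory.get(user) == password


class TokenAuthService:
    """Hypothetical TAS acting as the trusted agent for first-mile auth."""
    def __init__(self, idp):
        self._idp = idp

    def authenticate(self, user, password):
        # TAS performs the bind against the IdP on the client's behalf;
        # only on success does it mint an identity token for the cluster.
        if not self._idp.verify(user, password):
            raise PermissionError(f"IdP rejected credentials for {user}")
        # A real TAS would sign the token; a random suffix is a placeholder.
        return f"identity-token:{user}:{secrets.token_hex(8)}"


tas = TokenAuthService(LdapLikeIdP({"alice": "secret"}))
token = tas.authenticate("alice", "secret")
print(token.startswith("identity-token:alice:"))  # → True
```

In the second and third scenarios the same `authenticate` step would run on the agent/gateway side of the network boundary, with the resulting token, rather than the raw credentials, crossing over to the external TAS.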
> 2. I'm not sure I understand why AS is necessary. It seems to complicate the design by adding an unnecessary authorization check - authorization can/should happen at individual Hadoop services based on token attributes. I think you have mentioned before that authorization (with AS in place) would happen at both places (some level of authz at AS and finer grained authz at services). Can you elaborate on what value that adds over doing authz at services only? And, can you provide an example of what authz checks would happen at each place? (Say I access NameNode. What authz checks are done at AS and what is done at the service?)
I agree that authorization can be pushed to the service side, but centralized authorization has some advantages. For example, any authZ policy change can be enforced immediately instead of waiting for the policy to sync to each service, and it provides a central place for auditing client access. The centralized authZ behaves much like service-level authZ, except that it is centralized for the reasons just mentioned. In the scenario you mention: to access the HDFS service you need an access token granted according to the defined authZ policy; once you have that access token you can reach the HDFS service, but that does not mean you can access any file in HDFS. File/directory-level access control is still done by HDFS itself.
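A small sketch of the two-level check described above: a central AS grants a coarse-grained access token for a service (e.g. HDFS), while the service itself still enforces file/directory-level access. The class names and the dictionary-based policy format are illustrative assumptions for the sketch, not the actual design.

```python
class AuthorizationServer:
    """Hypothetical central AS: coarse-grained, immediately enforceable
    policy, and a single place to audit every grant."""
    def __init__(self, service_policy):
        self._policy = service_policy  # user -> set of services allowed

    def grant_access_token(self, user, service):
        if service not in self._policy.get(user, set()):
            raise PermissionError(f"{user} may not access {service}")
        return {"user": user, "service": service}


class HdfsLikeService:
    """The fine-grained check stays at the service: holding an access
    token for HDFS does not imply access to every file."""
    def __init__(self, path_acls):
        self._acls = path_acls  # path -> set of users allowed to read

    def read(self, access_token, path):
        if access_token["service"] != "hdfs":
            raise PermissionError("token not valid for HDFS")
        if access_token["user"] not in self._acls.get(path, set()):
            raise PermissionError(f"{access_token['user']} may not read {path}")
        return f"contents of {path}"


authz = AuthorizationServer({"alice": {"hdfs"}})
hdfs = HdfsLikeService({"/data/report": {"alice"}})

token = authz.grant_access_token("alice", "hdfs")
print(hdfs.read(token, "/data/report"))  # → contents of /data/report
```

Revoking "alice" in `AuthorizationServer` blocks new access tokens immediately, without waiting for a policy sync to each service, which is the enforcement advantage mentioned above.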
> 3. I believe this has been mentioned before, but the scope of this document makes it very difficult to move forward with contributing code. It would be very helpful to understand how you envision breaking this down into work items that the community can pick up (I think this is what the DISCUSS thread on common-dev was attempting to do).
This one I am trying to understand a little better. Please help me understand what you mean by "… scope of this document makes it very difficult to move forward with contributing code." If we were to break down the JIRA into a number of sub-tasks based on the document, would that be helpful?