The attached patches (this JIRA &
HDFS-1604 & MAPREDUCE-2287) implement authentication in the following manner:
A HadoopAuthenticationFilter (a servlet filter) is configured in front of all Hadoop web console JSPs.
This filter checks whether the incoming request is already authenticated (indicated by the presence of a signed HTTP cookie).
If the cookie is present, its signature is valid, and it has not expired, the request continues on to the page it invoked.
If the cookie is missing, invalid, or expired, the request is delegated to an authenticator handler. The authenticator handler is then responsible for requesting the user credentials from the user-agent and validating them. This may require one or more additional interactions between the authenticator handler and the user-agent (i.e., multiple HTTP requests). Once the authenticator handler verifies the credentials and generates an authentication token, a signed cookie is returned to the user-agent for all subsequent invocations.
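To make the signed-cookie check concrete, here is a minimal sketch of how a filter could sign and verify such a cookie with an HMAC plus an expiry timestamp. The cookie payload format ("user&expires") and the helper names are assumptions for illustration, not the actual patch code:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of the signed-cookie handling a filter like
// HadoopAuthenticationFilter performs. Not the patch's actual code.
public class SignedCookie {

    // Sign the payload with HMAC-SHA1 and append the signature.
    static String sign(String payload, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            byte[] sig = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return payload + "&s=" + Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Return the payload if the signature is valid and the token has not
    // expired; null means the request must go to the authenticator handler.
    static String verify(String cookie, byte[] secret, long nowMillis) {
        int idx = cookie.lastIndexOf("&s=");
        if (idx < 0) return null;                                // no signature
        String payload = cookie.substring(0, idx);
        if (!sign(payload, secret).equals(cookie)) return null;  // bad signature
        long expires = Long.parseLong(payload.substring(payload.lastIndexOf('&') + 1));
        return nowMillis < expires ? payload : null;             // null when expired
    }
}
```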
The authenticator handler is pluggable and two implementations are provided out of the box: pseudo/simple and kerberos.
The pseudo/simple authenticator handler is equivalent to the Hadoop pseudo/simple authentication. It trusts the value of the user.name query string parameter.
The pseudo/simple authenticator handler supports an anonymous mode that accepts any request, creating the token without requiring the user.name query string parameter (this is the default behavior, preserving the behavior of the Hadoop web consoles before this patch).
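The pseudo/simple decision logic described above can be sketched as follows; the method and parameter names are illustrative, not the patch's API:

```java
import java.util.Map;

// Hypothetical sketch of the pseudo/simple handler's decision logic:
// trust user.name when present, fall back to anonymous if allowed.
public class PseudoAuth {

    // Returns the authenticated user name, "anonymous" when anonymous mode
    // permits requests without user.name, or null to reject the request.
    static String authenticate(Map<String, String> queryParams, boolean anonymousAllowed) {
        String user = queryParams.get("user.name");
        if (user != null && !user.isEmpty()) {
            return user;                      // trust the declared identity
        }
        return anonymousAllowed ? "anonymous" : null;
    }
}
```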
The kerberos authenticator handler implements Kerberos HTTP SPNEGO authentication. This authenticator handler will generate a token only if a successful Kerberos HTTP SPNEGO interaction is performed between the user-agent and the authenticator. Browsers such as Firefox and Internet Explorer support Kerberos HTTP SPNEGO.
To use the kerberos authenticator handler, an HTTP service Kerberos principal is required (HTTP/$HOSTNAME@$REALM) and its credentials must be stored in a keytab file (most likely the same keytab file used by the node for the JT/NN/DN/TT credentials when Hadoop Kerberos authentication is on).
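For illustration, the filter configuration for the kerberos handler might look like the following in the Hadoop configuration; the exact property names shown here are assumptions and should be checked against the patch:

```xml
<!-- Hypothetical configuration; property names are illustrative only. -->
<property>
  <name>hadoop.http.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.http.authentication.kerberos.principal</name>
  <value>HTTP/$HOSTNAME@$REALM</value>
</property>
<property>
  <name>hadoop.http.authentication.kerberos.keytab</name>
  <value>/etc/hadoop/conf/hadoop.keytab</value>
</property>
```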
To support an additional authentication mechanism, an authenticator handler implementation must be written. The authenticator handler is a simple interface with three methods (init/destroy/authenticate).
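A possible shape for that three-method interface is sketched below. The servlet request/response types are stubbed so the sketch is self-contained; the real interface would use HttpServletRequest/HttpServletResponse, and the return type is an assumption:

```java
import java.util.Properties;

// Stand-ins for javax.servlet.http.HttpServletRequest/Response,
// used only to keep this sketch self-contained.
interface Request {}
interface Response {}

// Hypothetical shape of the pluggable authenticator handler interface.
interface AuthenticationHandler {
    // Called once when the filter starts; receives the filter configuration.
    void init(Properties config) throws Exception;

    // Called once when the filter shuts down; release keytabs, threads, etc.
    void destroy();

    // Interact with the user-agent; return the authenticated user name,
    // or null if more round-trips (e.g. SPNEGO negotiation) are needed.
    String authenticate(Request request, Response response) throws Exception;
}

// Trivial implementation for illustration: always returns a fixed user.
class FixedUserHandler implements AuthenticationHandler {
    public void init(Properties config) {}
    public void destroy() {}
    public String authenticate(Request request, Response response) {
        return "test-user";
    }
}
```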
The HadoopAuthenticationFilter extends Alfredo's AuthenticationFilter, overriding the getConfiguration() method to load its configuration from the Hadoop conf/ directory (via the Configuration class).
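One plausible job for that override is to pick the relevant properties out of the Hadoop configuration and strip their prefix so the filter sees plain configuration keys. A minimal sketch, assuming a "hadoop.http.authentication." prefix (the prefix and helper name are illustrative):

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of what an overridden getConfiguration() might do:
// select the filter's properties from the Hadoop configuration by prefix
// and strip the prefix before handing them to the filter.
public class FilterConfig {
    static final String PREFIX = "hadoop.http.authentication.";

    static Properties extract(Map<String, String> hadoopConf) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : hadoopConf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                // "hadoop.http.authentication.type" becomes "type"
                props.setProperty(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return props;
    }
}
```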
Alfredo (http://cloudera.github.com/alfredo) is an HTTP client/server authentication framework. Alfredo is distributed under the Apache License, it is fully documented, and it has comprehensive test cases.
In case the question comes up, the motivation for doing the authentication framework as a separate project (Alfredo) was:
- Other Hadoop related projects can start using it today without having to wait for a Hadoop release
- It has applicability outside of Hadoop