The way Hadoop currently works, MapReduce tasks (and similarly anything running in YARN containers) do not have access to Kerberos tickets. Instead, they use the DIGEST-MD5 authentication scheme. This is explained in great detail here: http://carfield.com.hk/document/distributed/hadoop-security-design.pdf
The gist is that when the job is started, the job client obtains a signed "delegation token" and a secret "token authenticator" from the service (typically the NameNode (NN) or JobTracker (JT)). These are distributed to the mappers (or containers) through a credentials cache. The server keeps its part of the secret (typically in NN/JT memory) and uses it to authenticate the clients.
In order for Kafka to support this authentication scheme, we need to:
1) Be able to generate the token
2) Be able to store the broker half of the secret in a way that is secure but accessible to all brokers (probably ZK)
3) Be able to authenticate clients using the tokens
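To make the three steps concrete, here is a minimal sketch of the HMAC-style scheme the Hadoop design doc describes, where the token authenticator is derived from the token identifier and a master key held only by the server. Everything named here (DelegationTokenSketch, newMasterKey, tokenAuthenticator, authenticate, the "owner=...;seq=..." identifier format) is made up for illustration and is not an existing Kafka or Hadoop API. It also simplifies in two ways: the master key lives in memory instead of in a shared store such as ZK, and the authenticator is compared directly, whereas with DIGEST-MD5 it would serve as the shared password and never cross the wire.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;

// Hypothetical sketch, not a Kafka or Hadoop class.
final class DelegationTokenSketch {
    private static final String HMAC_ALGORITHM = "HmacSHA1";

    // Broker half of the secret. Every broker would need the same key,
    // which is why the note above suggests a shared store.
    private final byte[] masterKey;

    public DelegationTokenSketch(byte[] masterKey) {
        this.masterKey = masterKey.clone();
    }

    // Generate a fresh random master key.
    public static byte[] newMasterKey() {
        byte[] key = new byte[32];
        new SecureRandom().nextBytes(key);
        return key;
    }

    // Step 1: derive the secret token authenticator for a token identifier.
    public byte[] tokenAuthenticator(String tokenId) {
        try {
            Mac mac = Mac.getInstance(HMAC_ALGORITHM);
            mac.init(new SecretKeySpec(masterKey, HMAC_ALGORITHM));
            return mac.doFinal(tokenId.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    // Step 3: recompute the authenticator from the identifier and compare
    // it in constant time with what the client presented.
    public boolean authenticate(String tokenId, byte[] presented) {
        return MessageDigest.isEqual(tokenAuthenticator(tokenId), presented);
    }

    public static void main(String[] args) {
        byte[] sharedKey = newMasterKey();
        DelegationTokenSketch issuingBroker = new DelegationTokenSketch(sharedKey);
        DelegationTokenSketch otherBroker = new DelegationTokenSketch(sharedKey);

        String tokenId = "owner=camus;seq=1"; // made-up identifier format
        byte[] authenticator = issuingBroker.tokenAuthenticator(tokenId);

        // Any broker holding the shared key can verify the token.
        System.out.println(otherBroker.authenticate(tokenId, authenticator));
        // A different identifier does not match the same authenticator.
        System.out.println(otherBroker.authenticate("owner=other;seq=1", authenticator));
    }
}
```

Because the authenticator can be recomputed from the identifier, brokers only need to share the master key rather than per-token state, which is what makes step 2 tractable.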
Other goals I see are:
1) Reuse Hadoop code to avoid re-inventing the wheel, but otherwise minimize dependencies
2) If at all possible, and without patching Hadoop, we want this to be transparent to existing MR jobs (especially Camus). If that is not possible, we should provide an easy-to-use authentication API to minimize the required changes.
Open question: Do we need to support token renewal?
I'm working on a design doc for this part, but it will probably only happen after Strata.