Short description of the ZooKeeper Kerberos architecture we made for Mahadev Konar.
I will describe our architecture using the workflow of client authentication.
1. The client initiates a connection.
2. The client adds Kerberos authentication info using the new function I've added:
zoo_add_auth_cb(zoohandle, "kerberos", authcallback, authcontext, completioncallback, data)
where authcallback is a function that can obtain an authentication token for the connection.
The signature of that callback is the following: typedef struct buffer (get_auth_cert_t)(const char *hostname, const void *ctx);
This means it returns the token in the ZooKeeper buffer structure.
The callback receives the hostname the client will actually connect to and the context that was originally
passed to zoo_add_auth_cb. The context is not used by the C Kerberos code, but it might come in handy for other
protocols, and it is needed for our planned Python binding.
3. At this point the auth_info is just stored on the client side. I've modified the client to call the registered callback every time before actually sending the auth_info to the server, to ensure the token is not stale. In this callback I use the Kerberos GSS-API to initialize a security context (gss_init_sec_context) based on the host of the server the client actually connects to. The hostname is required because we do mutual authentication, so only the correct host can decode the token that the client sends.
4. On the server side we made a Kerberos authentication plugin using the authentication plugin mechanism of the ZooKeeper server. That plugin receives the token and decodes the userid of the client from it. Later the plugin can use this userid to make ACL decisions as well. Basically it maintains a map of sessions to users.
5. To really make use of this authentication, the ZooKeeper nodes should have Kerberos ACLs. We actually implemented different algorithms to validate an operation. Basically the plugin has the userid, so it can make a decision purely based on that, or it can pass the userid and the ACL of the node to a service, rule engine, or anything else and delegate the decision to it.
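To make steps 2 and 3 a bit more concrete, here is a minimal C sketch of the client-side part. It is only an illustration, not the code from our patch: struct auth_info, demo_auth_callback, fresh_token and callback_calls are hypothetical names, struct buffer is redeclared locally so the sketch compiles without the ZooKeeper headers, and the callback returns a placeholder token instead of calling gss_init_sec_context.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for the ZooKeeper buffer structure (a length plus a byte
 * pointer), redeclared so this sketch builds without the ZooKeeper headers. */
struct buffer {
    int32_t len;
    char *buff;
};

/* Callback type as given in step 2 (a function type, so it is stored
 * through a pointer below). */
typedef struct buffer (get_auth_cert_t)(const char *hostname, const void *ctx);

/* Hypothetical record of what the client keeps after zoo_add_auth_cb:
 * the scheme, the callback, and the opaque context. No token is stored. */
struct auth_info {
    const char *scheme;
    get_auth_cert_t *callback;
    const void *ctx;
};

/* Counts callback invocations, only to make the refresh visible. */
int callback_calls = 0;

/* Hypothetical stand-in for the Kerberos callback. A real one would build
 * a GSS-API token for the given host; this one returns a placeholder that
 * embeds the hostname. The context is ignored, as in the C Kerberos code. */
struct buffer demo_auth_callback(const char *hostname, const void *ctx)
{
    static const char prefix[] = "demo-token-for:";
    struct buffer token = { 0, NULL };
    size_t prefix_len = sizeof(prefix) - 1;
    size_t host_len = strlen(hostname);

    (void)ctx;
    callback_calls++;

    token.buff = malloc(prefix_len + host_len + 1);
    if (token.buff == NULL)
        return token;              /* len stays 0 on allocation failure */
    memcpy(token.buff, prefix, prefix_len);
    memcpy(token.buff + prefix_len, hostname, host_len + 1);
    token.len = (int32_t)(prefix_len + host_len);
    return token;                  /* caller frees token.buff */
}

/* Step 3: ask the callback for a fresh token right before the auth_info
 * is sent, using the host the client is actually connected to. */
struct buffer fresh_token(const struct auth_info *auth,
                          const char *connected_host)
{
    assert(auth != NULL && auth->callback != NULL);
    return auth->callback(connected_host, auth->ctx);
}
```

In this sketch, reconnecting to a different server simply means calling fresh_token again with the new hostname, so a token made for one host is never reused for another.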
Let me know if you have further questions.