Hadoop KMS is the gateway through which Hadoop and Hadoop clients reach the underlying KMS. It provides an interface that works with existing Hadoop security components (authentication, confidentiality).
Hadoop KMS will provide an additional, client-server implementation of the Hadoop KeyProvider class.
The client-server protocol will be secure:
- Kerberos HTTP SPNEGO (authentication)
- HTTPS for transport (confidentiality and integrity)
- Hadoop ACLs (authorization)
The Hadoop KMS implementation will not provide additional ACLs for access to encrypted files. For sophisticated access control requirements, HDFS ACLs (HDFS-4685) should be used.
Basic key administration will be supported by the Hadoop KMS via the already available Hadoop KeyShell command-line tool.
A few minor changes must be made to the Hadoop KeyProvider functionality:
- The KeyProvider contract, and the existing implementations, must be thread-safe
- The KeyProvider API should include a method to generate the key material internally
- JavaKeyStoreProvider should use, if present, a password provided via configuration
- KeyProvider Option and Metadata should include a label (for easier cross-referencing)
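As an illustration of generating key material internally, the following is a minimal sketch using the standard JCE KeyGenerator. The class name KeyMaterialSketch and the method generateKeyMaterial are hypothetical and are not part of the Hadoop KeyProvider API; the actual method name and signature are left to the implementation.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyGenerator;

/** Hypothetical helper; NOT the Hadoop KeyProvider API. */
public class KeyMaterialSketch {

    /**
     * Generates random key material of the requested size, so callers
     * do not have to supply the raw key bytes themselves.
     *
     * @param algorithm JCE algorithm name, for example "AES"
     * @param bitLength key size in bits, for example 128
     * @return freshly generated key bytes
     */
    public static byte[] generateKeyMaterial(String algorithm, int bitLength) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance(algorithm);
            generator.init(bitLength);
            return generator.generateKey().getEncoded();
        } catch (NoSuchAlgorithmException e) {
            // Surface an unknown algorithm as a caller error.
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
    }
}
```

A provider could call such a helper when a key is created without caller-supplied material, keeping the raw bytes inside the provider.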
To avoid overloading the underlying KeyProvider implementation, the Hadoop KMS will cache keys using a TTL policy.
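The TTL policy can be sketched as follows. This is only an illustration under stated assumptions: the class TtlKeyCache, its loader function (standing in for a call to the underlying KeyProvider), and the injectable clock are all hypothetical names, and the real Hadoop KMS cache may be structured differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache sketch; NOT the Hadoop KMS implementation. */
public class TtlKeyCache {

    /** Cached key material plus the time at which it stops being valid. */
    private static final class Entry {
        final byte[] material;
        final long expiresAtMillis;

        Entry(byte[] material, long expiresAtMillis) {
            this.material = material;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, byte[]> loader; // stands in for the underlying KeyProvider
    private final long ttlMillis;
    private final LongSupplier clock;              // injectable so expiry can be tested

    public TtlKeyCache(Function<String, byte[]> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the key material, calling the loader only if the entry is missing or expired. */
    public byte[] get(String keyName) {
        long now = clock.getAsLong();
        // compute() is atomic per key, so concurrent callers trigger at most one load.
        Entry entry = entries.compute(keyName, (name, current) -> {
            if (current != null && now < current.expiresAtMillis) {
                return current; // still fresh: serve from cache
            }
            byte[] loaded = loader.apply(name);
            // A null result removes the mapping, so missing keys are not cached.
            return loaded == null ? null : new Entry(loaded, now + ttlMillis);
        });
        // Hand out a copy so callers cannot mutate the cached bytes.
        return entry == null ? null : entry.material.clone();
    }
}
```

In production the clock would typically be System::currentTimeMillis. A shorter TTL makes key changes in the underlying provider visible sooner, at the cost of more calls to it.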
Scalability and high availability of the Hadoop KMS can be achieved by running multiple instances behind a VIP/load balancer. For high availability, the underlying KeyProvider implementation used by the Hadoop KMS must itself be highly available.