Extend the idea from HADOOP-6520 "UGI should load tokens from the environment" to a generic lightweight "keychain" design. Load keys (secrets) into a keychain in UGI (secret map) at startup. YARN will distribute them securely into each container. The Hadoop code running in the container can then retrieve the credentials from UGI.
The use case is Bring Your Own Key (BYOK) credentials for cloud connectors (adl, wasb, s3a, etc.), while Hadoop authentication is still Kerberos. It requires no configuration change and no admin involvement. It will initially support YARN applications, e.g., DistCp, Tera Suite, and Spark on YARN.
Implementation is surprisingly simple because almost all pieces are in place:
- Retrieve secrets from UGI using conf.getPassword backed by the existing Credential Provider class UserProvider
- Reuse Credential Provider classes and interface to define a local permanent or transient credential store, e.g., LocalJavaKeyStoreProvider
- New: create a new transient Credential Provider that logs into AAD with username/password or device code, and then puts the Client ID and Refresh Token into the keychain
- New: create a new permanent Credential Provider based on Hadoop configuration XML, for dev/testing purposes
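
To illustrate the retrieval path in the first bullet, here is a minimal plain-Java sketch of the lookup order: the keychain is consulted first, and the clear-text configuration value is used only if the keychain has no entry. This is a model under stated assumptions, not the proposed patch. KeychainSketch, Keychain, and the getPassword helper below are hypothetical stand-ins for UGI's secret map and conf.getPassword, java.util.Properties stands in for the Hadoop configuration, and the aliases are only illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Plain-Java model of the keychain lookup. None of these types are Hadoop classes. */
final class KeychainSketch {

    /** Hypothetical stand-in for the secret map held by UGI. */
    static final class Keychain {
        private final Map<String, byte[]> secrets = new HashMap<>();

        /** Stores a secret under an alias, as a provider would do at startup. */
        void put(String alias, char[] secret) {
            secrets.put(alias, new String(secret).getBytes(StandardCharsets.UTF_8));
        }

        /** Returns the secret for the alias, or null if the keychain has none. */
        char[] get(String alias) {
            byte[] raw = secrets.get(alias);
            return raw == null ? null : new String(raw, StandardCharsets.UTF_8).toCharArray();
        }
    }

    /** Hypothetical helper: keychain first, then the clear-text configuration value. */
    static char[] getPassword(Keychain keychain, Properties conf, String name) {
        char[] fromKeychain = keychain.get(name);
        if (fromKeychain != null) {
            return fromKeychain;
        }
        String fallback = conf.getProperty(name);
        return fallback == null ? null : fallback.toCharArray();
    }

    public static void main(String[] args) {
        // At startup: a transient provider puts the user's own key into the keychain.
        Keychain keychain = new Keychain();
        keychain.put("fs.adl.oauth2.refresh.token", "user-supplied-token".toCharArray());

        // Cluster configuration stays unchanged and holds no secret for this alias.
        Properties conf = new Properties();

        // In the container: connector code asks for the secret by name.
        char[] token = getPassword(keychain, conf, "fs.adl.oauth2.refresh.token");
        System.out.println("token found: " + (token != null));
    }
}
```

In this model the connector code only knows the alias, so the same call works whether the secret came from the user's keychain or from the configuration.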