commit 749708fcbb6c96f81d1f51245758b99b2e2c65b7
Author: Eric Yang
Date:   Fri Jul 20 13:07:29 2018 -0400

    YARN-8520. Added instructions for docker user management.
    Contributed by Eric Yang

diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md
index a2ef6fe..244b317 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md
@@ -248,7 +248,8 @@ owner as the container user. If the application owner is not a valid user
 in the Docker image, the application will fail. The container user is specified
 by the user's UID. If the user's UID is different between the NodeManager host
 and the Docker image, the container may be launched as the wrong user or may
-fail to launch because the UID does not exist.
+fail to launch because the UID does not exist. See
+[User Management in Docker Container](#user-management) section for more details.
 
 Second, the Docker image must have whatever is expected by the application
 in order to execute. In the case of Hadoop (MapReduce or Spark), the Docker
@@ -365,6 +366,122 @@ the environment variable would be set to "/sys/fs/cgroup:/sys/fs/cgroup:ro".
 The destination path is not restricted, "/sys/fs/cgroup:/cgroup:ro" would also
 be valid given the example admin whitelist.
 
+User Management in Docker Container
+-----------------------------------
+
+YARN Docker container support launches container with uid:gid identity known to the NodeManager host. For preventing security mistakes, it is recommended to keep user identity and user's primary group identity uniform across nodes and container images. This is important to ensure file permission and ownership published from Docker container to external mounts retain the same security settings.
+
+Docker image by default will authenticate against accounts locally stored on the container /etc/passwd and /etc/shadow. If the launched uid:gid exists in /etc/passwd of the container, the username will appear as the user in /etc/passwd entry. For keeping user identity consistent across nodes, it might be better to centralize user and group lookup through PAM. The same setup could in principle be used to authenticate against a remote database using SSSD or other PAM plugins, just the container PAM settings would be different.
+
+This is the traditional schema for Linux authentication:
+```
+application -> libpam -> pam_authenticate -> pam_unix.so -> /etc/passwd
+```
+
+If we use SSSD for user lookup, it becomes:
+```
+application -> libpam -> pam_authenticate -> pam_sss.so -> SSSD -> pam_unix.so -> /etc/passwd
+```
+
+We can bind-mount UNIX sockets SSSD communicates over into the container. This will allow the SSSD client side libraries to authenticate against the SSSD running on the host. User information does not need to exist in /etc/passwd of the docker image.
+
+Step by step configuration for host and container:
+
+## 1. Host config
+
+ - Install packages
+ ```
+ # yum -y install sssd-common sssd-proxy
+ ```
+ - create a PAM service for the container.
+ ```
+ # cat /etc/pam.d/sss_proxy
+ auth required pam_unix.so
+ account required pam_unix.so
+ password required pam_unix.so
+ session required pam_unix.so
+ ```
+ - create SSSD config file, /etc/sssd/sssd.conf
+ Please note that the permissions must be 0600 and the file must be owned by root:root.
+ ```
+ # cat /etc/sssd/sssd.conf
+ [sssd]
+ services = nss,pam
+ config_file_version = 2
+ domains = proxy
+ [nss]
+ [pam]
+ [domain/proxy]
+ id_provider = proxy
+ proxy_lib_name = files
+ proxy_pam_target = sss_proxy
+ ```
+ - start sssd
+ ```
+ # systemctl start sssd
+ ```
+ - verify a user can be retrieved with sssd
+ ```
+ # getent passwd -s sss localuser
+ ```
+
+## 2. Container setup
+
+ It's important to bind-mount the /var/lib/sss/pipes directory from the host to the container since SSSD UNIX sockets are located there.
+ ```
+ -v /var/lib/sss/pipes:/var/lib/sss/pipes:rw
+ ```
+
+## 3. Container config
+
+ All the steps below should be executed on the container itself.
+
+ - Install only the sss client libraries
+ ```
+ # yum -y install sssd-client
+ ```
+
+ - make sure sss is configured for passwd and group databases in
+ ```
+ /etc/nsswitch.conf
+ ```
+
+ - configure the PAM service that the application uses to call into SSSD
+ ```
+ # cat /etc/pam.d/system-auth
+ #%PAM-1.0
+ # This file is auto-generated.
+ # User changes will be destroyed the next time authconfig is run.
+ auth required pam_env.so
+ auth sufficient pam_unix.so try_first_pass nullok
+ auth sufficient pam_sss.so forward_pass
+ auth required pam_deny.so
+
+ account required pam_unix.so
+ account [default=bad success=ok user_unknown=ignore] pam_sss.so
+ account required pam_permit.so
+
+ password requisite pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=
+ password sufficient pam_unix.so try_first_pass use_authtok nullok sha512 shadow
+ password sufficient pam_sss.so use_authtok
+ password required pam_deny.so
+
+ session optional pam_keyinit.so revoke
+ session required pam_limits.so
+ -session optional pam_systemd.so
+ session [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
+ session required pam_unix.so
+ session optional pam_sss.so
+ ```
+
+ - Save the docker image and use the docker image as base image for your applications.
+
+ - test the docker image launched in YARN environment.
+ ```
+ $ id
+ uid=5000(localuser) gid=5000(localuser) groups=5000(localuser),1337(hadoop)
+ ```
+
 Privileged Container Security Consideration
 -------------------------------------------