Description
This patch addresses the following issues when Docker containers are used:
1. GPU driver and NVIDIA libraries: if GPU drivers and NVIDIA libraries are pre-packaged inside a Docker image, they can conflict with the driver and NVIDIA libraries installed on the host OS. An alternative is to detect the drivers and devices installed on the host OS and mount them when launching the Docker container. Please refer to [1] for more details.
2. Image detection:
From [2], the challenge is:
Mounting user-level driver libraries and device files clobbers the environment of the container, so it should be done only when the container runs a GPU application. The challenge is to determine whether a given image will use the GPU. We should also prevent launching containers from a Docker image that is incompatible with the host NVIDIA driver version; see [2] for more details.
3. GPU isolation.
Proposed solution:
a. Use nvidia-docker-plugin [3] to address issue #1; this is the same solution used by Kubernetes [4]. Issue #2 can be addressed in a separate JIRA.
We won't ship nvidia-docker-plugin with our releases; cluster admins must pre-install nvidia-docker-plugin to use GPU + Docker support on YARN. "nvidia-docker" is a wrapper around the docker binary that could address #3 as well, but it does not provide the same semantics as docker and requires additional environment setup (e.g. PATH/LD_LIBRARY_PATH) to use it. To avoid introducing additional issues, we plan to use the nvidia-docker-plugin + docker binary approach.
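As a rough illustration of why the plugin alone is sufficient: nvidia-docker-plugin exposes a small REST API (default port 3476) whose /docker/cli endpoint returns the docker CLI arguments matching the host's driver and devices, so the plain docker binary can be used directly. The response below is illustrative (driver version and device set are assumptions, not taken from this patch):

```shell
# Sketch: in a live environment the NM-equivalent step would be
#   GPU_ARGS=$(curl -s http://localhost:3476/docker/cli)
# Here we hard-code a typical response so the example runs without the plugin.
GPU_ARGS='--volume-driver=nvidia-docker --volume=nvidia_driver_384.66:/usr/local/nvidia:ro --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0'

# The returned arguments are simply appended to a normal docker invocation:
echo "docker run $GPU_ARGS <image> <command>"
```

Note that the response includes --volume-driver; as described in (b) below, this patch creates the named volume explicitly instead of relying on that flag.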
b. To handle the GPU driver and NVIDIA libraries, we use nvidia-docker-plugin [3] to create a volume that contains the GPU-related libraries and mount it when the Docker container is launched. Changes include:
- Instead of using --volume-driver, this patch adds a docker volume create command to container-executor (c-e) and the NM Java side. The reason is that --volume-driver can apply only a single volume driver to each launched Docker container.
- Updated c-e and the Java side so that if a mounted volume is a named Docker volume, the file-existence check is skipped. (Named volumes still need to be added to the permitted list in container-executor.cfg.)
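The named-volume flow above can be sketched as the following two docker commands. This assumes nvidia-docker-plugin has registered its "nvidia-docker" volume driver; the volume name and driver version (384.66) are illustrative, and the commands are echoed rather than executed so the sketch runs standalone:

```shell
# Assumed host driver version; the real value comes from the host OS.
DRIVER_VERSION="384.66"
VOLUME="nvidia_driver_${DRIVER_VERSION}"

# Create the named volume up front (instead of passing --volume-driver at run time):
echo "docker volume create --driver=nvidia-docker --name=${VOLUME}"

# Mount it read-only when launching the container; because it is a named
# volume, c-e skips the host-path existence check for it:
echo "docker run -v ${VOLUME}:/usr/local/nvidia:ro <image> <command>"
```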
c. To address isolation issue:
We found that cgroups + Docker do not work under newer Docker versions, which use runc as the default runtime: setting --cgroup-parent to a cgroup that contains any devices.deny entry prevents the Docker container from launching.
Instead, this patch passes the allowed GPU devices to the docker launch command via --device.
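A minimal sketch of building the per-container --device flags for the allocated GPUs. The minor numbers are illustrative (the NM would derive them from the scheduled GPU set), and the final command is echoed rather than executed:

```shell
# GPUs allocated to this container, by device minor number (illustrative).
ALLOCATED_MINORS="0 1"

DEVICE_ARGS=""
for m in $ALLOCATED_MINORS; do
  DEVICE_ARGS="$DEVICE_ARGS --device=/dev/nvidia${m}"
done

# Control devices needed by any GPU workload, regardless of which GPUs
# were allocated:
DEVICE_ARGS="$DEVICE_ARGS --device=/dev/nvidiactl --device=/dev/nvidia-uvm"

echo "docker run$DEVICE_ARGS <image> <command>"
```

Because only the allocated /dev/nvidiaN nodes are passed through, isolation is enforced by device visibility rather than by a devices.deny rule in the cgroup parent.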
References:
[1] https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver
[2] https://github.com/NVIDIA/nvidia-docker/wiki/Image-inspection
[3] https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker-plugin
[4] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
Attachments
Issue Links
- relates to YARN-9174: Backport YARN-7224 for refactoring of GpuDevice class (Resolved)