Hadoop YARN / YARN-10248

When allowed-gpu-devices is configured, excluded GPUs are still visible to containers


Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: nodemanager
    • Patch

    Description

      I have a server with two GPUs, and I want to use only one of them within the YARN cluster.
      Following the Hadoop documentation, I set these configs:

      <property>
        <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
        <value>0:1</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables</name>
        <value>/etc/alternatives/x86_64-linux-gnu_nvidia_smi</value>
      </property>
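
      For reference, the allowed-gpu-devices value is a comma-separated list of index:minor_number pairs, so 0:1 allows only the GPU with minor number 1. The following is a minimal sketch of how such a value can be read; the class and method names are hypothetical and this is not the NodeManager's own parser.

```java
import java.util.ArrayList;
import java.util.List;

public class AllowedGpuDevicesSketch {
    // Hypothetical helper (not a Hadoop API): parses "index:minor[,index:minor...]"
    // into a list of {index, minorNumber} pairs.
    static List<int[]> parse(String value) {
        List<int[]> devices = new ArrayList<>();
        for (String entry : value.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException(
                    "Expected index:minor_number, got: " + entry);
            }
            devices.add(new int[] {
                Integer.parseInt(parts[0].trim()),
                Integer.parseInt(parts[1].trim())
            });
        }
        return devices;
    }

    public static void main(String[] args) {
        // The value used in this report.
        for (int[] d : parse("0:1")) {
            System.out.println("index=" + d[0] + " minor=" + d[1]); // prints index=0 minor=1
        }
    }
}
```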
      

      Then I ran the following command to test:

      yarn jar ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar \
               -jar ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar  -shell_command ' nvidia-smi & sleep 3  ' \
               -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1  \
               -num_containers 1 -queue yufei -node_label_expression slaves
      

      I expected the GPU with minor number 0 not to be visible to the container, but in the launched container nvidia-smi prints information for both GPUs.

      I checked the related source code and found that this is a bug.
      The problem is:
      when allowed-gpu-devices is specified, GpuDiscoverer populates the usable GPUs from it.
      Then, when some of those GPUs are assigned to a container, it sets the denied GPUs for that container,
      but it never takes the GPUs excluded on the host into account.
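
      The effect described above can be sketched as follows. This is only an illustration under the assumption that the denied set is computed as "pool minus assigned"; the class and method names are hypothetical and do not come from the Hadoop source or the attached patch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class DeniedGpuSketch {
    // Hypothetical helper (not a Hadoop method): every GPU minor number in
    // `pool` that was not assigned to the container is treated as denied.
    static Set<Integer> denied(Set<Integer> pool, Set<Integer> assigned) {
        Set<Integer> result = new LinkedHashSet<>(pool);
        result.removeAll(assigned);
        return result;
    }

    public static void main(String[] args) {
        // Minor numbers of all GPUs physically present on the host.
        Set<Integer> allOnHost = new LinkedHashSet<>(Arrays.asList(0, 1));
        // Usable GPUs derived from allowed-gpu-devices = 0:1.
        Set<Integer> allowed = new LinkedHashSet<>(Arrays.asList(1));
        // GPUs assigned to the container (yarn.io/gpu=1).
        Set<Integer> assigned = new LinkedHashSet<>(Arrays.asList(1));

        // Denied set built only from the usable GPUs: minor 0 is never denied.
        System.out.println(denied(allowed, assigned));   // prints []
        // Denied set built from every GPU on the host: minor 0 is denied.
        System.out.println(denied(allOnHost, assigned)); // prints [0]
    }
}
```

      With the pool limited to the usable GPUs, the excluded GPU never appears in the denied set, which matches nvidia-smi showing both GPUs inside the container.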

      Attachments

        1. YARN-10248-branch-3.2.001.path
          17 kB
          zhao yufei
        2. YARN-10248-branch-3.2.001.path
          17 kB
          zhao yufei

            People

              Assignee: jasstionzyf zhao yufei
              Reporter: jasstionzyf zhao yufei
              Votes: 0
              Watchers: 6
