Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: yarn
    • Labels:
      None

      Description

      Introduction

      Mounting files or directories from the host is one way of passing configuration and other information into a docker container.
      We could allow the user to set a list of mounts in the environment of ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2).
      These would be mounted read-only to the specified target locations. This has been resolved in YARN-4595

      2.Problem Definition

      Bug mounting arbitrary volumes into a Docker container can be a security risk.

      3.Possible solutions

      one approach to provide safe mounts is to allow the cluster administrator to configure a set of parent directories as white list mounting directories.
      Add a property named yarn.nodemanager.volume-mounts.white-list, when container executor do mount checking, only the allowed directories or sub-directories can be mounted.

      1. YARN-5534.001.patch
        4 kB
        luhuichun
      2. YARN-5534.002.patch
        4 kB
        luhuichun

        Issue Links

          Activity

          Hide
          templedf Daniel Templeton added a comment -

          A good use case for this is mounting in the Hadoop directories so that they don't have to be build into the container. Another use case is mounting in the local tool chain.

          Show
          templedf Daniel Templeton added a comment - A good use case for this is mounting in the Hadoop directories so that they don't have to be build into the container. Another use case is mounting in the local tool chain.
          Hide
          luhuichun luhuichun added a comment -
          Show
          luhuichun luhuichun added a comment - Sidharta Seethana Varun Vasudev
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          0 mvndep 0m 53s Maven dependency ordering for branch
          +1 mvninstall 6m 58s trunk passed
          +1 compile 6m 40s trunk passed
          +1 checkstyle 0m 52s trunk passed
          +1 mvnsite 1m 15s trunk passed
          +1 mvneclipse 0m 45s trunk passed
          +1 findbugs 2m 10s trunk passed
          +1 javadoc 0m 57s trunk passed
          0 mvndep 0m 10s Maven dependency ordering for patch
          +1 mvninstall 0m 47s the patch passed
          +1 compile 4m 59s the patch passed
          +1 javac 4m 59s the patch passed
          -0 checkstyle 0m 49s hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 210 unchanged - 0 fixed = 213 total (was 210)
          +1 mvnsite 1m 14s the patch passed
          +1 mvneclipse 0m 45s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 25s the patch passed
          +1 javadoc 0m 55s the patch passed
          -1 unit 0m 36s hadoop-yarn-api in the patch failed.
          +1 unit 15m 46s hadoop-yarn-server-nodemanager in the patch passed.
          +1 asflicense 0m 39s The patch does not generate ASF License warnings.
          58m 25s



          Reason Tests
          Failed junit tests hadoop.yarn.conf.TestYarnConfigurationFields



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:e809691
          JIRA Issue YARN-5534
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12836189/YARN-5534.001.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 3abe45fdbaac 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / f38a6d0
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13820/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13820/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13820/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13820/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 19s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 0 mvndep 0m 53s Maven dependency ordering for branch +1 mvninstall 6m 58s trunk passed +1 compile 6m 40s trunk passed +1 checkstyle 0m 52s trunk passed +1 mvnsite 1m 15s trunk passed +1 mvneclipse 0m 45s trunk passed +1 findbugs 2m 10s trunk passed +1 javadoc 0m 57s trunk passed 0 mvndep 0m 10s Maven dependency ordering for patch +1 mvninstall 0m 47s the patch passed +1 compile 4m 59s the patch passed +1 javac 4m 59s the patch passed -0 checkstyle 0m 49s hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 210 unchanged - 0 fixed = 213 total (was 210) +1 mvnsite 1m 14s the patch passed +1 mvneclipse 0m 45s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 2m 25s the patch passed +1 javadoc 0m 55s the patch passed -1 unit 0m 36s hadoop-yarn-api in the patch failed. +1 unit 15m 46s hadoop-yarn-server-nodemanager in the patch passed. +1 asflicense 0m 39s The patch does not generate ASF License warnings. 58m 25s Reason Tests Failed junit tests hadoop.yarn.conf.TestYarnConfigurationFields Subsystem Report/Notes Docker Image:yetus/hadoop:e809691 JIRA Issue YARN-5534 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12836189/YARN-5534.001.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 3abe45fdbaac 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / f38a6d0 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13820/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13820/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13820/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/13820/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          vvasudev Varun Vasudev added a comment -

          luhuichun - can you please address the issues in the Jenkins report -
          1) Please add some unit tests for the patch
          2) Please address the failing unit test

          Thanks!

          Show
          vvasudev Varun Vasudev added a comment - luhuichun - can you please address the issues in the Jenkins report - 1) Please add some unit tests for the patch 2) Please address the failing unit test Thanks!
          Hide
          templedf Daniel Templeton added a comment - - edited

          Thanks for posting the patch, luhuichun. Sorry for taking so long to get around to reviewing it. I apparently also misread the issue description the first time.

          Given that the current volume mounts only allow mounting directories from the set of localized files, I'm not sure additional white listing is all that useful. And given that YARN-5298 already mounts all the localized directories, I'm not sure this JIRA will actually change anything.

          What I originally thought I read, and what I think would be useful, is allowing arbitrary volume mounts from a whitelist, not just mounting localized resources. For example, If I'm going to use a Docker image to execute MR jobs, I have to install Hadoop in that image. When I upgrade my cluster, I then have to upgrade or recreate all my Docker images. If the Hadoop directories were mountable, I could let YARN mount them from the host machine and not have to worry about it.

          Show
          templedf Daniel Templeton added a comment - - edited Thanks for posting the patch, luhuichun . Sorry for taking so long to get around to reviewing it. I apparently also misread the issue description the first time. Given that the current volume mounts only allow mounting directories from the set of localized files, I'm not sure additional white listing is all that useful. And given that YARN-5298 already mounts all the localized directories, I'm not sure this JIRA will actually change anything. What I originally thought I read, and what I think would be useful, is allowing arbitrary volume mounts from a whitelist, not just mounting localized resources. For example, If I'm going to use a Docker image to execute MR jobs, I have to install Hadoop in that image. When I upgrade my cluster, I then have to upgrade or recreate all my Docker images. If the Hadoop directories were mountable, I could let YARN mount them from the host machine and not have to worry about it.
          Hide
          luhuichun luhuichun added a comment -

          yes, Daniel. YARN-4595 and YARN-5298 only mounts localized directories. so my idea is to allow more directories from a whitelist, which would be helpful for some applications, so what do you mean "I'm not sure this JIRA will actually change anything."? you mean it's redundant?

          Show
          luhuichun luhuichun added a comment - yes, Daniel. YARN-4595 and YARN-5298 only mounts localized directories. so my idea is to allow more directories from a whitelist, which would be helpful for some applications, so what do you mean "I'm not sure this JIRA will actually change anything."? you mean it's redundant?
          Hide
          templedf Daniel Templeton added a comment -

          validateMount() already rejects anything that isn't a localized resource path. What this patch does is reject anything that's not also whitelisted, i.e. to accept a mount, it must be both a localized resource path and whitelisted. Because YARN-5298 already mounts all localized resource paths, YARN-4595 and this patch don't accomplish much.

          Show
          templedf Daniel Templeton added a comment - validateMount() already rejects anything that isn't a localized resource path. What this patch does is reject anything that's not also whitelisted, i.e. to accept a mount, it must be both a localized resource path and whitelisted. Because YARN-5298 already mounts all localized resource paths, YARN-4595 and this patch don't accomplish much.
          Hide
          shanekumpf@gmail.com Shane Kumpf added a comment -

          Thanks for the patch luhuichun!

          I agree with Daniel Templeton. YARN-4595 only allows for mounting localized resources, which isn't flexible enough for what we need here. We'd like to eliminate issues such as YARN-5042 (mounting /sys/fs/cgroup in every container) every time we have a similar need. Another example would be /dev/urandom, which is commonly mounted into containers that generate keys.

          The current implementation is moving towards allowing subdirectories under a white listed mount to be mounted into the docker container. What is the use case for allowing subdirectories vs forcing the user supplied mount to match the white list entry?

          Here are some items to address in the future patch:

          1)

          +
          +  public static final String NM_WHITE_LIST_VOLUME_MOUNT =
          +          NM_PREFIX + "white-list-volume-mount";
          +
          

          The configuration should be under the DOCKER_CONTAINER_RUNTIME_PREFIX.

          2)

                   if (!path.isAbsolute()) {
                     throw new ContainerExecutionException("Mount must be absolute: " +
          -              mount);
          +                  mount);
                   }
                   if (Files.isSymbolicLink(path)) {
                     throw new ContainerExecutionException("Mount cannot be a symlink: " +
          -              mount);
          +                  mount);
          

          Can you fix the formatting changes here?

          3)

          +  private boolean isSubDirectory(File parent, File child){
          +    try {
          +      parent = parent.getCanonicalFile();
          +      child = child.getCanonicalFile();
          +      File parentFile = child;
          +      while (parentFile != null){
          +        if (parent.equals(parentFile)){
          +          return true;
          +        }
          +        parentFile = parentFile.getParentFile();
          +      }
          +    } catch (IOException e) {
          +      e.printStackTrace();
          +    }
          +    return false;
             }
          

          Assuming we need subdirectories, I would expect a subdirectory check has been implemented elsewhere in the code base and doesn't need to be defined again, but I didn't find it in my non-exhaustive search. Perhaps someone else can chime in if they know of one. Otherwise, maybe it would be good to add this to a utility class vs leaving it in the runtime?

          Show
          shanekumpf@gmail.com Shane Kumpf added a comment - Thanks for the patch luhuichun ! I agree with Daniel Templeton . YARN-4595 only allows for mounting localized resources, which isn't flexible enough for what we need here. We'd like to eliminate issues such as YARN-5042 (mounting /sys/fs/cgroup in every container) every time we have a similar need. Another example would be /dev/urandom, which is commonly mounted into containers that generate keys. The current implementation is moving towards allowing subdirectories under a white listed mount to be mounted into the docker container. What is the use case for allowing subdirectories vs forcing the user supplied mount to match the white list entry? Here are some items to address in the future patch: 1) + + public static final String NM_WHITE_LIST_VOLUME_MOUNT = + NM_PREFIX + "white-list-volume-mount" ; + The configuration should be under the DOCKER_CONTAINER_RUNTIME_PREFIX . 2) if (!path.isAbsolute()) { throw new ContainerExecutionException( "Mount must be absolute: " + - mount); + mount); } if (Files.isSymbolicLink(path)) { throw new ContainerExecutionException( "Mount cannot be a symlink: " + - mount); + mount); Can you fix the formatting changes here? 3) + private boolean isSubDirectory(File parent, File child){ + try { + parent = parent.getCanonicalFile(); + child = child.getCanonicalFile(); + File parentFile = child; + while (parentFile != null ){ + if (parent.equals(parentFile)){ + return true ; + } + parentFile = parentFile.getParentFile(); + } + } catch (IOException e) { + e.printStackTrace(); + } + return false ; } Assuming we need subdirectories, I would expect a subdirectory check has been implemented elsewhere in the code base and doesn't need to be defined again, but I didn't find it in my non-exhaustive search. Perhaps someone else can chime in if they know of one. Otherwise, maybe it would be good to add this to a utility class vs leaving it in the runtime?
          Hide
          templedf Daniel Templeton added a comment - - edited

          Thanks for updating the patch, luhuichun. The new patch matches more with what I had in mind. There are still a couple of issues:

          • In YarnConfiguration, you should add a Javadoc comment for the new property
          • In DLCR.isArbitraryMount(), instead of the for loop, you should use a foreach.
          • DSCR.isArbitraryMount() always returns false.
                File child = new File(mount);
                for (int i = 0; i < whiteList.length; i++){
                  File parent = new File(mount);
                  if (isSubDirectory(parent, child)){
                    return false;
                  }
                }

            It's always true that parent.equals(child), so isSubdirectory() will always return true.

          • You should parse the white list property once and store it instead of doing it every time the method is called
          • Instead of using String.split(), you could use Pattern.split() to allow for whitespace, making it a little more user friendly
          • Look at http://stackoverflow.com/questions/26530445/compare-directories-to-check-if-one-is-sub-directory-of-another for ideas of how to do the subdirectory check more efficiently. I like the Set idea.
          • In DLCR.isSubdirectory(), the naming around child, parent, and parentFile is pretty confusing.
          • In DLCR.isSubdirectory(), printing the stack trace is a bad idea. Do something more useful. At a minimum, log the exception instead of printing the stack trace to stderr.
          • You should have a space before the curly brace throughout, e.g.
              if (parent.equals(parentFile)){  
          Show
          templedf Daniel Templeton added a comment - - edited Thanks for updating the patch, luhuichun . The new patch matches more with what I had in mind. There are still a couple of issues: In YarnConfiguration , you should add a Javadoc comment for the new property In DLCR.isArbitraryMount() , instead of the for loop, you should use a foreach . DSCR.isArbitraryMount() always returns false . File child = new File(mount); for ( int i = 0; i < whiteList.length; i++){ File parent = new File(mount); if (isSubDirectory(parent, child)){ return false ; } } It's always true that parent.equals(child) , so isSubdirectory() will always return true. You should parse the white list property once and store it instead of doing it every time the method is called Instead of using String.split() , you could use Pattern.split() to allow for whitespace, making it a little more user friendly Look at http://stackoverflow.com/questions/26530445/compare-directories-to-check-if-one-is-sub-directory-of-another for ideas of how to do the subdirectory check more efficiently. I like the Set idea. In DLCR.isSubdirectory() , the naming around child , parent , and parentFile is pretty confusing. In DLCR.isSubdirectory() , printing the stack trace is a bad idea. Do something more useful. At a minimum, log the exception instead of printing the stack trace to stderr. You should have a space before the curly brace throughout, e.g. if (parent.equals(parentFile)){
          Hide
          shanekumpf@gmail.com Shane Kumpf added a comment -

          luhuichun Zhankun Tang - We're close on this one. Would you like me to take the lead on this and get it wrapped up? Thanks!

          Show
          shanekumpf@gmail.com Shane Kumpf added a comment - luhuichun Zhankun Tang - We're close on this one. Would you like me to take the lead on this and get it wrapped up? Thanks!

            People

            • Assignee:
              luhuichun luhuichun
              Reporter:
              luhuichun luhuichun
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:

                Development