Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-3611 Support Docker Containers In LinuxContainerExecutor
  3. YARN-4266

Allow whitelisted users to disable user re-mapping/squashing when launching docker containers

    Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: yarn
    • Labels:
      None

      Description

      Docker provides a mechanism (the --user switch) that enables us to specify the user the container processes should run as. We use this mechanism today when launching docker containers . In non-secure mode, we run the docker container based on `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in secure mode, as the submitting user. However, this mechanism breaks down with a large number of 'pre-created' images which don't necessarily have the users available within the image. Examples of such images include shared images that need to be used by multiple users. We need a way in which we can allow a pre-defined set of users to run containers based on existing images, without using the --user switch. There are some implications of disabling this user squashing that we'll need to work through : log aggregation, artifact deletion etc.,

        Issue Links

          Activity

          Hide
          tangzhankun Zhankun Tang added a comment -

          Sidharta Seethana, Any progress/update on this? We would like to help pushing this if you will.

          Show
          tangzhankun Zhankun Tang added a comment - Sidharta Seethana , Any progress/update on this? We would like to help pushing this if you will.
          Hide
          sidharta-s Sidharta Seethana added a comment -

          Zhankun Tang, please feel free to take this over. Do you have a proposal for dealing with logs, directories etc?

          Show
          sidharta-s Sidharta Seethana added a comment - Zhankun Tang , please feel free to take this over. Do you have a proposal for dealing with logs, directories etc?
          Hide
          tangzhankun Zhankun Tang added a comment -

          Not yet. I'll submit the proposal asap. Thanks.

          Show
          tangzhankun Zhankun Tang added a comment - Not yet. I'll submit the proposal asap. Thanks.
          Hide
          tangzhankun Zhankun Tang added a comment -

          A draft proposal uploaded. Please review. Sidharta Seethana, Daniel Templeton, Varun Vasudev

          Show
          tangzhankun Zhankun Tang added a comment - A draft proposal uploaded. Please review. Sidharta Seethana , Daniel Templeton , Varun Vasudev
          Hide
          templedf Daniel Templeton added a comment -

          I don't really like either option. 3.1 is a no-go for security reasons. 3.2 seems risky. First, just running usermod isn't enough, as explained in https://muffinresearch.co.uk/linux-changing-uids-and-gids-for-user/ Second, how can you guarantee that the new UID isn't already in use in the container? Third, the reason we're having this discussion is because it's desirable to run a container without modification. Modifying the container is not a valid alternative to modifying the container. I think we still have some brainstorming to do.

          Show
          templedf Daniel Templeton added a comment - I don't really like either option. 3.1 is a no-go for security reasons. 3.2 seems risky. First, just running usermod isn't enough, as explained in https://muffinresearch.co.uk/linux-changing-uids-and-gids-for-user/ Second, how can you guarantee that the new UID isn't already in use in the container? Third, the reason we're having this discussion is because it's desirable to run a container without modification. Modifying the container is not a valid alternative to modifying the container. I think we still have some brainstorming to do.
          Hide
          tangzhankun Zhankun Tang added a comment -

          Thanks for the review and poiting out the GID thing. Daniel Templeton.
          I have done testings based on quick implementations for the two ways of option 3.2. Based on testings, a generic launch_container.sh serving all scenarios seems not possible. So leaving launch_container.sh and modify it only in LCE when white-listed user is preferred. This ensure that current use cases won't be affected. The only known limitation of this solution is that we need the Docker image to package "sudo" command in it because we leverage it to switch user in container at run-time.

          But as you mentioned, there are still more concerns for my preferred solution:

          • What if the new UID/GID is already in use in the container?
            The usermod and groupmod commands both has an "-o" option that allow using duplicate (non-unique) UID/GID. So we are able to change the UID and GUID to what we want without failure.
          • How can we run a container without modification?
            The answer depends on the container application type(One thing to note is that if we allow a container run as root for white-listed users, we don't need to modify container. This needs discussion too but I guess it's hard to get approved).
            • For applications in container that have dependencies on local host, like MRv2 application:
              The applications would almost certain encounter the log file write permission and the secure token read permission issues due to the "bind-mounted" directory.
            • For applications in container don't have dependencies on local host, like web application:
              We seem need only to solve the log write permission during my testing (not sure for the log aggregation). Or find another way which don't persist log in local host.
          • To sum up, if we don't want to change container at all, the known blocking issues here are how to collect log from container and how to feed the container secure token in a different way from "bind-mount" host directory. IMO, it's nearly impossible because the only way to connect container and host seems "bind-mount" or it will involve modification to container.

          But again, current MRv2/spark in container already needs UID modification in Docker images, requirements of "sudo" installed in Docker images is not unacceptable to enable this white-listed user as a light weight solution.
          Anyway, I'm open to any better solution. Ideas? Daniel Templeton, Varun Vasudev, Sidharta Seethana

          Show
          tangzhankun Zhankun Tang added a comment - Thanks for the review and poiting out the GID thing. Daniel Templeton . I have done testings based on quick implementations for the two ways of option 3.2. Based on testings, a generic launch_container.sh serving all scenarios seems not possible. So leaving launch_container.sh and modify it only in LCE when white-listed user is preferred. This ensure that current use cases won't be affected. The only known limitation of this solution is that we need the Docker image to package "sudo" command in it because we leverage it to switch user in container at run-time. But as you mentioned, there are still more concerns for my preferred solution: What if the new UID/GID is already in use in the container? The usermod and groupmod commands both has an "-o" option that allow using duplicate (non-unique) UID/GID. So we are able to change the UID and GUID to what we want without failure. How can we run a container without modification? The answer depends on the container application type(One thing to note is that if we allow a container run as root for white-listed users, we don't need to modify container. This needs discussion too but I guess it's hard to get approved). For applications in container that have dependencies on local host, like MRv2 application: The applications would almost certain encounter the log file write permission and the secure token read permission issues due to the "bind-mounted" directory. For applications in container don't have dependencies on local host, like web application: We seem need only to solve the log write permission during my testing (not sure for the log aggregation). Or find another way which don't persist log in local host. To sum up, if we don't want to change container at all, the known blocking issues here are how to collect log from container and how to feed the container secure token in a different way from "bind-mount" host directory . IMO, it's nearly impossible because the only way to connect container and host seems "bind-mount" or it will involve modification to container. But again, current MRv2/spark in container already needs UID modification in Docker images, requirements of "sudo" installed in Docker images is not unacceptable to enable this white-listed user as a light weight solution. Anyway, I'm open to any better solution. Ideas? Daniel Templeton , Varun Vasudev , Sidharta Seethana
          Hide
          tangzhankun Zhankun Tang added a comment -

          Based on testings, update the proposal

          Show
          tangzhankun Zhankun Tang added a comment - Based on testings, update the proposal
          Hide
          templedf Daniel Templeton added a comment - - edited

          After some discussion and thought, I think that 3.2 is OK as long as the user has the option to turn it off. In other words, the contract is that either the user takes care of making sure the container has the required user, or let YARN do it for you.

          It might be worth exploring how far dockers logs could take us in dealing with the logging end of the problem.

          Any comments, Sidharta Seethana, Varun Vasudev, or Shane Kumpf?

          Show
          templedf Daniel Templeton added a comment - - edited After some discussion and thought, I think that 3.2 is OK as long as the user has the option to turn it off. In other words, the contract is that either the user takes care of making sure the container has the required user, or let YARN do it for you. It might be worth exploring how far dockers logs could take us in dealing with the logging end of the problem. Any comments, Sidharta Seethana , Varun Vasudev , or Shane Kumpf ?
          Hide
          tangzhankun Zhankun Tang added a comment -

          Updated the proposal(5.4 detailed implementation)

          Show
          tangzhankun Zhankun Tang added a comment - Updated the proposal(5.4 detailed implementation)
          Hide
          tangzhankun Zhankun Tang added a comment - - edited

          A patch based on branch-2.8 for your reference (manual tests including MRv2, spark, dockerized MRv2 passed). Will add unit tests and re-base to trunk after we get agreement.

          Show
          tangzhankun Zhankun Tang added a comment - - edited A patch based on branch-2.8 for your reference (manual tests including MRv2, spark, dockerized MRv2 passed). Will add unit tests and re-base to trunk after we get agreement.
          Hide
          sidharta-s Sidharta Seethana added a comment -

          Usermod seems to be of limited use. From usermod's man page :

          -u, --uid UID
                     The new numerical value of the user's ID.
          
                     This value must be unique, unless the -o option is used. The value must be non-negative.
          
                     The user's mailbox, and any files which the user owns and which are located in the user's home directory will have the file user ID changed automatically.
          
                     The ownership of files outside of the user's home directory must be fixed manually.
          
                     No checks will be performed with regard to the UID_MIN, UID_MAX, SYS_UID_MIN, or SYS_UID_MAX from /etc/login.defs.
          

          If nothing outside the user's home directory is updated, this is likely to break many images that use custom/non-root users ?

          Show
          sidharta-s Sidharta Seethana added a comment - Usermod seems to be of limited use. From usermod's man page : -u, --uid UID The new numerical value of the user's ID. This value must be unique, unless the -o option is used. The value must be non-negative. The user's mailbox, and any files which the user owns and which are located in the user's home directory will have the file user ID changed automatically. The ownership of files outside of the user's home directory must be fixed manually. No checks will be performed with regard to the UID_MIN, UID_MAX, SYS_UID_MIN, or SYS_UID_MAX from /etc/login.defs. If nothing outside the user's home directory is updated, this is likely to break many images that use custom/non-root users ?
          Hide
          tangzhankun Zhankun Tang added a comment -

          Sidharta Seethana Although we can alleviate it by "find / -user <OLDUID> -exec chown -h <NEWUID> {} \;", I'm afraid this will cost overhead if nothing outside the user's home directory needs ownership changes. Or we can just remind end user about this limitation if we don't want this overhead

          Show
          tangzhankun Zhankun Tang added a comment - Sidharta Seethana Although we can alleviate it by "find / -user <OLDUID> -exec chown -h <NEWUID> {} \;", I'm afraid this will cost overhead if nothing outside the user's home directory needs ownership changes. Or we can just remind end user about this limitation if we don't want this overhead
          Hide
          sidharta-s Sidharta Seethana added a comment -

          IMO, the usermod approach described in the design document is error prone and likely to break in various ways.

          • Like I mentioned in a earlier comment, the usermod operation only makes changes to the home directory and not elsewhere in the image. Modifying the rest of the image is not scalable and could significantly slow down the launch of every container where we go down this path.
          • Running this usermod operation also requires that launch_container.sh be launched as a privileged user. Also, note that running docker run --pid=host … bash ../launch_container.sh $newUID $containerUsername does not guarantee that launch_container.sh as described here will work correctly - if the image has a ‘USER’ directive, launch_container.sh will be run as that user and that user might not have privileges to run a usermod operation.
          • In addition, I don’t believe we should be using ­­—pid=host. This exposes other containers’s processes into this container - which will break isolation and possibly behavior for certain applications (applications that assume a single instance is running on a ’node’, for example).
          • Lastly, adding more commands that run inside the container (usermod in this case) adds even more requirements for the docker image being launched : we already require that bash, find, ls etc be present in the image.

          I can’t think of a way where we can find a generic solution for disabling —user that will work for all (or even a large number of) scenarios while still being in line with YARN’s security model/log aggregation etc., . (I’d be happy to be proven wrong here). I think we need to acknowledge this and find a set of canonical use cases that we want to support - and see how we can enable each one of them. Couple of examples off the top of my head : 1) Spark/MRv2 - IMO, it maybe easier to support these because they are already first class YARN applications. 2) Apache/httpd based images - more digging to do there.

          Thoughts ? I apologize for my spotty availability on this JIRA - I’ll try to provide more prompt responses/feedback in the future.

          Show
          sidharta-s Sidharta Seethana added a comment - IMO, the usermod approach described in the design document is error prone and likely to break in various ways. Like I mentioned in a earlier comment, the usermod operation only makes changes to the home directory and not elsewhere in the image. Modifying the rest of the image is not scalable and could significantly slow down the launch of every container where we go down this path. Running this usermod operation also requires that launch_container.sh be launched as a privileged user. Also, note that running docker run --pid=host … bash ../launch_container.sh $newUID $containerUsername does not guarantee that launch_container.sh as described here will work correctly - if the image has a ‘USER’ directive, launch_container.sh will be run as that user and that user might not have privileges to run a usermod operation. In addition, I don’t believe we should be using ­­—pid=host. This exposes other containers’s processes into this container - which will break isolation and possibly behavior for certain applications (applications that assume a single instance is running on a ’node’, for example). Lastly, adding more commands that run inside the container (usermod in this case) adds even more requirements for the docker image being launched : we already require that bash, find, ls etc be present in the image. I can’t think of a way where we can find a generic solution for disabling —user that will work for all (or even a large number of) scenarios while still being in line with YARN’s security model/log aggregation etc., . (I’d be happy to be proven wrong here). I think we need to acknowledge this and find a set of canonical use cases that we want to support - and see how we can enable each one of them. Couple of examples off the top of my head : 1) Spark/MRv2 - IMO, it maybe easier to support these because they are already first class YARN applications. 2) Apache/httpd based images - more digging to do there. Thoughts ? I apologize for my spotty availability on this JIRA - I’ll try to provide more prompt responses/feedback in the future.
          Hide
          tangzhankun Zhankun Tang added a comment - - edited

          Sidharta Seethana, Great thanks for reviewing this!

          Like I mentioned in a earlier comment, the usermod operation only makes changes to the home directory and not elsewhere in the image. Modifying the rest of the image is not scalable and could significantly slow down the launch of every container where we go down this path.

          Yes. Agree with this. This is a drawback that we cannot avoid at present.

          Running this usermod operation also requires that launch_container.sh be launched as a privileged user. Also, note that running docker run --pid=host … bash ../launch_container.sh $newUID $containerUsername does not guarantee that launch_container.sh as described here will work correctly - if the image has a ‘USER’ directive, launch_container.sh will be run as that user and that user might not have privileges to run a usermod operation.

          You might missed the part in the patch that we'll use "--user=root" to guarantee successful "usermod". We first inspect the Docker image, if it setup a non-root user and wants to run with it, we'll use "--user=root". If the setup user in image is root, we'll just let it go.

          In addition, I don’t believe we should be using ­­—pid=host. This exposes other containers’s processes into this container - which will break isolation and possibly behavior for certain applications (applications that assume a single instance is running on a ’node’, for example).

          thanks for pointing this. I forget to delete this when I'm trying different implementation(sudo issue if I remember correctly). I have a double-check and --pid=host is not needed.

          Lastly, adding more commands that run inside the container (usermod in this case) adds even more requirements for the docker image being launched : we already require that bash, find, ls etc be present in the image.

          This usermod is installed by default in most distributions I guess. Since we already require several commands in the image, why cannot we document this too?

          IMO, this is the light-weight way that can work securely and won't break down the log staff. The drawbacks are:
          1. usermod is a requirement in Docker image
          2. usermod only changes the UID of files in home directory.
          I indeed got some complaint about current user remapping from customer. So I think this JIRA is an important feature to make YARN a good supporter for container(Docker and others maybe) and possibly not only big data Docker images. We should invite more people on this. Daniel Templeton, Varun Vasudev, Shane Kumpf, Zhongyue Nah?

          Show
          tangzhankun Zhankun Tang added a comment - - edited Sidharta Seethana , Great thanks for reviewing this! Like I mentioned in a earlier comment, the usermod operation only makes changes to the home directory and not elsewhere in the image. Modifying the rest of the image is not scalable and could significantly slow down the launch of every container where we go down this path. Yes. Agree with this. This is a drawback that we cannot avoid at present. Running this usermod operation also requires that launch_container.sh be launched as a privileged user. Also, note that running docker run --pid=host … bash ../launch_container.sh $newUID $containerUsername does not guarantee that launch_container.sh as described here will work correctly - if the image has a ‘USER’ directive, launch_container.sh will be run as that user and that user might not have privileges to run a usermod operation. You might missed the part in the patch that we'll use "--user=root" to guarantee successful "usermod". We first inspect the Docker image, if it setup a non-root user and wants to run with it, we'll use "--user=root". If the setup user in image is root, we'll just let it go. In addition, I don’t believe we should be using ­­—pid=host. This exposes other containers’s processes into this container - which will break isolation and possibly behavior for certain applications (applications that assume a single instance is running on a ’node’, for example). thanks for pointing this. I forget to delete this when I'm trying different implementation(sudo issue if I remember correctly). I have a double-check and --pid=host is not needed. Lastly, adding more commands that run inside the container (usermod in this case) adds even more requirements for the docker image being launched : we already require that bash, find, ls etc be present in the image. This usermod is installed by default in most distributions I guess. Since we already require several commands in the image, why cannot we document this too? IMO, this is the light-weight way that can work securely and won't break down the log staff. The drawbacks are: 1. usermod is a requirement in Docker image 2. usermod only changes the UID of files in home directory. I indeed got some complaint about current user remapping from customer. So I think this JIRA is an important feature to make YARN a good supporter for container(Docker and others maybe) and possibly not only big data Docker images. We should invite more people on this. Daniel Templeton , Varun Vasudev , Shane Kumpf , Zhongyue Nah ?
          Hide
          shanekumpf@gmail.com Shane Kumpf added a comment -

          Thanks for the design document and discussion on the approaches, Zhankun Tang! I think we can agree that there are challenges with all of the proposed approaches, but it seems an ideal approach may not exist today. As you and others have called out; 3.1 has security implications, 3.2 could introduce significant overhead, 3.3 depends on docker logs which is error prone and handling of tokens is an unknown.

          Given these approaches, on the surface, 3.3 seems like the least invasive wrt container changes. Eliminating the writable bind mounts may also make security easier to grok. Getting the token into the container doesn't seem all that difficult to address. How do others feel about 3.3?

          Not to hijack, but on a related note, I still believe user namespace remapping will be our future solution here. User namespace remapping would allow us to map the root user in the container, to the run as user on the host, eliminating many of the issues. I revisited this feature this morning in hopes it had evolved in the last couple of releases of Docker, but unfortunately it hasn't. The current user namespace remapping feature in docker can only be applied to a single user and is set at the daemon level, which will not work for us in both non-secure and secure modes. I believe it would currently be possible to support user namespace remapping for non-secure mode, but not both. Many issues are opened on the docker github requesting per container user namespace remapping, but the sharing of image layers makes this non-trivial to add. I really don't like the idea of varying approaches for secure and non-secure mode, but I would be happy to work on this approach for non-secure containers if others feel it is worth pursuing.

          Show
          shanekumpf@gmail.com Shane Kumpf added a comment - Thanks for the design document and discussion on the approaches, Zhankun Tang ! I think we can agree that there are challenges with all of the proposed approaches, but it seems an ideal approach may not exist today. As you and others have called out; 3.1 has security implications, 3.2 could introduce significant overhead, 3.3 depends on docker logs which is error prone and handling of tokens is an unknown. Given these approaches, on the surface, 3.3 seems like the least invasive wrt container changes. Eliminating the writable bind mounts may also make security easier to grok. Getting the token into the container doesn't seem all that difficult to address. How do others feel about 3.3? Not to hijack, but on a related note, I still believe user namespace remapping will be our future solution here. User namespace remapping would allow us to map the root user in the container, to the run as user on the host, eliminating many of the issues. I revisited this feature this morning in hopes it had evolved in the last couple of releases of Docker, but unfortunately it hasn't. The current user namespace remapping feature in docker can only be applied to a single user and is set at the daemon level, which will not work for us in both non-secure and secure modes. I believe it would currently be possible to support user namespace remapping for non-secure mode, but not both. Many issues are opened on the docker github requesting per container user namespace remapping, but the sharing of image layers makes this non-trivial to add. I really don't like the idea of varying approaches for secure and non-secure mode, but I would be happy to work on this approach for non-secure containers if others feel it is worth pursuing.
          Hide
          tangzhankun Zhankun Tang added a comment -

          Shane Kumpf, Thanks for the comments. Yes. Docker is not ready on this user remapping and seems long time to go.
          But still, one thing I need to clarify on 3.2 implementation is that, I prefer documentation the limitation rather than changing UID of non-home directories because significant overhead involved. Does this make sense to you?

          Show
          tangzhankun Zhankun Tang added a comment - Shane Kumpf , Thanks for the comments. Yes. Docker is not ready on this user remapping and seems long time to go. But still, one thing I need to clarify on 3.2 implementation is that, I prefer documentation the limitation rather than changing UID of non-home directories because significant overhead involved. Does this make sense to you?
          Hide
          templedf Daniel Templeton added a comment -

          I don't have an issue taking the usermod path, assuming that we're only changing the UID on the home dir and that we clearly document what we do and what to do if you want to opt out. That we take the usermod path now does not preclude continued work to improve the process, such as trying to make the Docker logs approach work. It would be bad if we end up with more than one way to solve the problem that we expose to the user, but I see no issue with evolving the implementation if we find a better solution. The end users shouldn't care if a later implementation reduces the constraints or improves the process.

          That said, I think it behooves us to at least investigate whether we can make the docker logs approach work. I know we've done some initial research, but have we gone far enough to know for sure how viable it is?

          Show
          templedf Daniel Templeton added a comment - I don't have an issue taking the usermod path, assuming that we're only changing the UID on the home dir and that we clearly document what we do and what to do if you want to opt out. That we take the usermod path now does not preclude continued work to improve the process, such as trying to make the Docker logs approach work. It would be bad if we end up with more than one way to solve the problem that we expose to the user, but I see no issue with evolving the implementation if we find a better solution. The end users shouldn't care if a later implementation reduces the constraints or improves the process. That said, I think it behooves us to at least investigate whether we can make the docker logs approach work. I know we've done some initial research, but have we gone far enough to know for sure how viable it is?
          Hide
          ebadger Eric Badger added a comment -

          Zhankun Tang, I tried out the most recent patch that you put up and I have a few comments.

          +      if (disableUserRemapping) {
          
          +      if (disableUserRemapping && targetUID != null) {
          
          • Shouldn't disableUserRemapping be negated in these statements? We want to do the remapping if disableUserRemapping is not disabled.
          +        containerPredefinedUser = getDockerImageInfo("{{.Config.User}}", imageName, containerIdStr);
          
          • There is no check to see whether containerPredefinedUser actually got set to anything. It's possible for docker inspect to not return a predefined user. In this case, we will be unable to do remapping and the usermod command will fail because the user will be blank.
          +        //get runAsUser's UID for container to usermod when init
          +        if (!containerPredefinedUser.equals("root")) {
          +          targetUID = getLocalUid(runAsUser);
          +        }
          
          • I think checking containerPredefinedUser misses some cases here. You may still want the container to be run as a different user even if the predefined user is root. If we don't remap when the predefined user is root, then anything written out to shared data volumes will have messed up permissions outside of the container.
          +        String cmd = "\"usermod -o -u " + targetUID + " " + containerPredefinedUser
          +            + " && su " + containerPredefinedUser + " bash -c '"
          
          • It's not guaranteed that the predefined user has /bin/bash shell permissions. So it may be prudent to add a -s /bin/bash to the usermod command.

          Making the above changes I've been able to successfully submit and run jobs as multiple different users without permissions issues. The only necessity seems to be that there be a predefined user in the image that is being used.

          Additionally, this usermod approach doesn't currently deal with group permissions at all, which could be an issue especially in multi-tenant clusters.

          Show
          ebadger Eric Badger added a comment - Zhankun Tang , I tried out the most recent patch that you put up and I have a few comments. + if (disableUserRemapping) { + if (disableUserRemapping && targetUID != null) { Shouldn't disableUserRemapping be negated in these statements? We want to do the remapping if disableUserRemapping is not disabled. + containerPredefinedUser = getDockerImageInfo("{{.Config.User}}", imageName, containerIdStr); There is no check to see whether containerPredefinedUser actually got set to anything. It's possible for docker inspect to not return a predefined user. In this case, we will be unable to do remapping and the usermod command will fail because the user will be blank. + //get runAsUser's UID for container to usermod when init + if (!containerPredefinedUser.equals("root")) { + targetUID = getLocalUid(runAsUser); + } I think checking containerPredefinedUser misses some cases here. You may still want the container to be run as a different user even if the predefined user is root. If we don't remap when the predefined user is root, then anything written out to shared data volumes will have messed up permissions outside of the container. + String cmd = "\"usermod -o -u " + targetUID + " " + containerPredefinedUser + + " && su " + containerPredefinedUser + " bash -c '" It's not guaranteed that the predefined user has /bin/bash shell permissions. So it may be prudent to add a -s /bin/bash to the usermod command. Making the above changes I've been able to successfully submit and run jobs as multiple different users without permissions issues. The only necessity seems to be that there be a predefined user in the image that is being used. Additionally, this usermod approach doesn't currently deal with group permissions at all, which could be an issue especially in multi-tenant clusters.
          Hide
          ebadger Eric Badger added a comment -

          Also, as an aside, is there any reason that we want to go with the usermod approach instead of leveraging docker to do this for us in the run command?

          https://docs.docker.com/engine/reference/run/#user

          We can still use the --user flag, but instead of passing a username we can just pass the UID/GID of runAsUser. According to the docker documentation, it will create the user if it doesn't exist, which means that we don't need to have a predefined user, plus we don't need to start the container as root. Until we get to user namespace remapping as Shane Kumpf eluded to earlier, it seems to me that this would be a less hacky way to get around the permissions issues. Thoughts?

          cc: Zhankun Tang, Sidharta Seethana, Daniel Templeton

          Show
          ebadger Eric Badger added a comment - Also, as an aside, is there any reason that we want to go with the usermod approach instead of leveraging docker to do this for us in the run command? https://docs.docker.com/engine/reference/run/#user We can still use the --user flag, but instead of passing a username we can just pass the UID/GID of runAsUser. According to the docker documentation, it will create the user if it doesn't exist, which means that we don't need to have a predefined user, plus we don't need to start the container as root. Until we get to user namespace remapping as Shane Kumpf eluded to earlier, it seems to me that this would be a less hacky way to get around the permissions issues. Thoughts? cc: Zhankun Tang , Sidharta Seethana , Daniel Templeton
          Hide
          templedf Daniel Templeton added a comment -

          Thanks for testing the patch, Eric Badger! The reason we can't just specify the user in the run command is that the NM will write the launch script and all the other data for the job into the working directory owned by the job owner. We then mount that directory into the Docker container and exec it. If we set the user in the run command, the Docker container wouldn't have the permissions to access the directory or launch the script. If we tried to write the working directory et al as the user ID we intend to use in the Docker container, we open a potential security hole, because the user with that ID on the NM would be able to access it.

          Show
          templedf Daniel Templeton added a comment - Thanks for testing the patch, Eric Badger ! The reason we can't just specify the user in the run command is that the NM will write the launch script and all the other data for the job into the working directory owned by the job owner. We then mount that directory into the Docker container and exec it. If we set the user in the run command, the Docker container wouldn't have the permissions to access the directory or launch the script. If we tried to write the working directory et al as the user ID we intend to use in the Docker container, we open a potential security hole, because the user with that ID on the NM would be able to access it.
          Hide
          ebadger Eric Badger added a comment -

          Daniel Templeton, I think you're misunderstanding what I was saying. When we instantiate the DockerRunCommand, we give it a user to run as. In this user remapping case, we've been setting that user to root so that we have permissions to perform the usermod remapping in the container. I'm saying, however, that we could set that use to the UID and GID of the user that submitted the job (i.e. runAsUser in the code).

          So let's say user foo:1000:1000 submitted a job. The NM will create the launch_container.sh script assuming user foo. Then we will go to launch the docker container. When we instantiate the DockerRunCommand, we would pass it the output of id -u and id -g. This sort of thing is already being done in Zhankun Tang's patch to get targetUID. The result would give us a command that looks like: docker run --user=1000:1000 .... There isn't a security hole here that I can see because the user in the container will have the same UID/GID as the user that submitted the job. Inside the container, the username associated with the UID doesn't really matter. Outside of the container, everything written by the user in the container will have the same UID. A downside would be that the username inside of the container isn't meaningful and could be potentially very misleading to those who are unaware of how this is all being done.

          I've tested the --user=UID:GID option locally on a single-node cluster and have been successful. Files/logs/etc. written in the container using are owned by the user that submitted the job, which is the UID:GID given in the --user option (foo:1000:1000 in the example above).

          There also isn't a problem with usernames being numbers (which the image could map to arbitrary UID/GIDs) because docker interprets all numbers in the --user option as UID/GIDs. I tested this locally to make sure. So even if there is a user named "2000" (with UID != 2000), the command docker run --user=2000 will create a new user with UID 2000.

          Show
          ebadger Eric Badger added a comment - Daniel Templeton , I think you're misunderstanding what I was saying. When we instantiate the DockerRunCommand , we give it a user to run as. In this user remapping case, we've been setting that user to root so that we have permissions to perform the usermod remapping in the container. I'm saying, however, that we could set that use to the UID and GID of the user that submitted the job (i.e. runAsUser in the code). So let's say user foo:1000:1000 submitted a job. The NM will create the launch_container.sh script assuming user foo. Then we will go to launch the docker container. When we instantiate the DockerRunCommand , we would pass it the output of id -u and id -g . This sort of thing is already being done in Zhankun Tang 's patch to get targetUID . The result would give us a command that looks like: docker run --user=1000:1000 ... . There isn't a security hole here that I can see because the user in the container will have the same UID/GID as the user that submitted the job. Inside the container, the username associated with the UID doesn't really matter. Outside of the container, everything written by the user in the container will have the same UID. A downside would be that the username inside of the container isn't meaningful and could be potentially very misleading to those who are unaware of how this is all being done. I've tested the --user=UID:GID option locally on a single-node cluster and have been successful. Files/logs/etc. written in the container using are owned by the user that submitted the job, which is the UID:GID given in the --user option (foo:1000:1000 in the example above). There also isn't a problem with usernames being numbers (which the image could map to arbitrary UID/GIDs) because docker interprets all numbers in the --user option as UID/GIDs. I tested this locally to make sure. So even if there is a user named "2000" (with UID != 2000), the command docker run --user=2000 will create a new user with UID 2000.
          Hide
          tangzhankun Zhankun Tang added a comment -

          Great thanks to Eric Badger and Daniel Templeton. Sorry for the late reply.
          Eric Badger, I had proposed the "--user=UID:GID" way at the very beginning. Please refer to here for the details.

          Show
          tangzhankun Zhankun Tang added a comment - Great thanks to Eric Badger and Daniel Templeton . Sorry for the late reply. Eric Badger , I had proposed the "--user=UID:GID" way at the very beginning. Please refer to here for the details.
          Hide
          ebadger Eric Badger added a comment -

          Zhankun Tang, thanks for pointing that out. I hadn't seen that conversation.

          It seems that the major issue with using --user=UID:GID is that there is no username. But is there any reason that we can't just add in an environment variable to the docker run command that is set to the username and then run a usermod to change the username of the associated UID? Usernames are just cosmetic and everything is done via UIDs, so I don't think it makes sense to run the docker container based on a username.

          Something like:
          docker run --user=2000 -e USERNAME=*username crafted in code*

          And then in the container startup command (with the container running as root):
          usermod -l $USERNAME $(getent passwd "1001" | cut -d: -f1) && su $USERNAME

          There are probably more efficient ways to do this, but this is just a general idea and proof of concept.

          The main problem that I can see with this method is if there is already a user in the image associated with the UID of the user on the host. In that case, we would need to remap the UID of the user in the image to something different before we could do the usermod (or else we would have potential permissions issues inside the container). However, this would also be easy to do.

          Sidharta Seethana, Daniel Templeton, Varun Vasudev, Zhongyue Nah, you were all very active on YARN-5360. Do you have any thoughts on the approach above given my explanation?

          Show
          ebadger Eric Badger added a comment - Zhankun Tang , thanks for pointing that out. I hadn't seen that conversation. It seems that the major issue with using --user=UID:GID is that there is no username. But is there any reason that we can't just add in an environment variable to the docker run command that is set to the username and then run a usermod to change the username of the associated UID? Usernames are just cosmetic and everything is done via UIDs, so I don't think it makes sense to run the docker container based on a username. Something like: docker run --user=2000 -e USERNAME=*username crafted in code* And then in the container startup command (with the container running as root): usermod -l $USERNAME $(getent passwd "1001" | cut -d: -f1) && su $USERNAME There are probably more efficient ways to do this, but this is just a general idea and proof of concept. The main problem that I can see with this method is if there is already a user in the image associated with the UID of the user on the host. In that case, we would need to remap the UID of the user in the image to something different before we could do the usermod (or else we would have potential permissions issues inside the container). However, this would also be easy to do. Sidharta Seethana , Daniel Templeton , Varun Vasudev , Zhongyue Nah , you were all very active on YARN-5360 . Do you have any thoughts on the approach above given my explanation?
          Hide
          sidharta-s Sidharta Seethana added a comment -

          Based on the discussions in this JIRA and on YARN-5360, it looks like all we have are less than ideal choices. Like I mentioned on YARN-5360, using the uid has readability issues and it still wouldn’t guarantee that an image would work correctly. In my opinion, we shouldn’t be adding more requirements on images - the whole objective of this jira was to remove a requirement where possible (--user). launch_container.sh already uses bash, ln, cp, chmod, ls, find. To this list we are considering adding usermod, su, getent and so on. In addition to this we are considering making (expensive) changes to a container prior to launching the application process - usermod only changes the files in a user’s home directory and even then we still have no way of predicting how long this operation would take - making application (process) launch time unpredictable. IMO, This is not the direction we should be going in.

          In the interest of making some progress, perhaps we could add support for optionally using -user=<uid>:<foo>(turned off by default). A subset of images that wouldn’t otherwise work, would be usable because of this change - for example : images that don’t have the user being specified (say foo) but would otherwise work with an arbitrary user (i.e the values supplied in -user=<uid_of_foo>:<gid_of_foo> don’t matter).

          I might have said this on other JIRAs and I’ll repeat here : docker containers and applications using them are just one category of workloads that are going to be run on a production YARN cluster. While we would like to use as much of the power and flexibility that docker provides, we have to do this with due consideration given to existing YARN/hadoop paradigms - security model (users/groups/permissions), localization, log aggregation and so on.

          Show
          sidharta-s Sidharta Seethana added a comment - Based on the discussions in this JIRA and on YARN-5360 , it looks like all we have are less than ideal choices. Like I mentioned on YARN-5360 , using the uid has readability issues and it still wouldn’t guarantee that an image would work correctly. In my opinion, we shouldn’t be adding more requirements on images - the whole objective of this jira was to remove a requirement where possible ( --user ). launch_container.sh already uses bash, ln, cp, chmod, ls, find. To this list we are considering adding usermod, su, getent and so on. In addition to this we are considering making (expensive) changes to a container prior to launching the application process - usermod only changes the files in a user’s home directory and even then we still have no way of predicting how long this operation would take - making application (process) launch time unpredictable. IMO, This is not the direction we should be going in. In the interest of making some progress, perhaps we could add support for optionally using - user=<uid>:<foo> (turned off by default). A subset of images that wouldn’t otherwise work, would be usable because of this change - for example : images that don’t have the user being specified (say foo) but would otherwise work with an arbitrary user (i.e the values supplied in -user=<uid_of_foo>:<gid_of_foo> don’t matter). I might have said this on other JIRAs and I’ll repeat here : docker containers and applications using them are just one category of workloads that are going to be run on a production YARN cluster. While we would like to use as much of the power and flexibility that docker provides, we have to do this with due consideration given to existing YARN/hadoop paradigms - security model (users/groups/permissions), localization, log aggregation and so on.
          Hide
          shanekumpf@gmail.com Shane Kumpf added a comment -

          I took another look at the progress being made on user namespaces in Docker and as far as I can tell, the story remains the same. I echo Sidharta Seethana, it just doesn't appear there is a solution here that will work for all container types. As Daniel Templeton pointed out, "Modifying the container is not a valid alternative to modifying the container", but we are limited on options here.

          As it appears the proposed solution will solve the problem for a class of container types, I'm +1 on adding the UID/usermod approach as an optional solution. Note that this solution won't help for official docker hub images such as postgres and apache without some sort of setuid wrapper, so we'll need to continue to discuss how we handle those.

          I do believe that docker logs is worth exploring as a means of reducing or eliminating the writable bind mounted directories. We could explore read-only mounts for the various caches. It seems the biggest hurdle there will be the secure tokens, but read-only may work here as well. Anyone already thought about this far enough to have a story for the tokens? Should we open a new ticket to discuss this approach?

          Show
          shanekumpf@gmail.com Shane Kumpf added a comment - I took another look at the progress being made on user namespaces in Docker and as far as I can tell, the story remains the same. I echo Sidharta Seethana , it just doesn't appear there is a solution here that will work for all container types. As Daniel Templeton pointed out, "Modifying the container is not a valid alternative to modifying the container", but we are limited on options here. As it appears the proposed solution will solve the problem for a class of container types, I'm +1 on adding the UID/usermod approach as an optional solution. Note that this solution won't help for official docker hub images such as postgres and apache without some sort of setuid wrapper, so we'll need to continue to discuss how we handle those. I do believe that docker logs is worth exploring as a means of reducing or eliminating the writable bind mounted directories. We could explore read-only mounts for the various caches. It seems the biggest hurdle there will be the secure tokens, but read-only may work here as well. Anyone already thought about this far enough to have a story for the tokens? Should we open a new ticket to discuss this approach?
          Hide
          ebadger Eric Badger added a comment -

          Sidharta Seethana, I agree with your assessment. I don't see this "--user" workaround to be the longterm solution, especially if the goal is to allow users to supply their own arbitrary, untrusted images. As others have identified previously in this jira, I believe that the real solution is to use user namespace remapping, which was introduced in Docker 1.10. However, that requires a more updated kernel (3.10) than I think most of us are on, especially in production.

          So, until then I think that allowing an arbitrary UID:GID (or even user:group) to enter the container will be sufficient (disabled by default, as you suggested). Though I believe that containers working in this way are under the big assumption that the image is trusted and well-crafted, which is necessary until we figure out the user remapping issue, resolve security concerns, etc.

          Show
          ebadger Eric Badger added a comment - Sidharta Seethana , I agree with your assessment. I don't see this "--user" workaround to be the longterm solution, especially if the goal is to allow users to supply their own arbitrary, untrusted images. As others have identified previously in this jira, I believe that the real solution is to use user namespace remapping , which was introduced in Docker 1.10. However, that requires a more updated kernel (3.10) than I think most of us are on, especially in production. So, until then I think that allowing an arbitrary UID:GID (or even user:group) to enter the container will be sufficient (disabled by default, as you suggested). Though I believe that containers working in this way are under the big assumption that the image is trusted and well-crafted, which is necessary until we figure out the user remapping issue, resolve security concerns, etc.
          Hide
          ebadger Eric Badger added a comment -

          Looks like I should have reloaded the page before commenting as I didn't see Shane Kumpf's comment. It's unfortunate that docker's namespace remapping doesn't work for multiple users. I guess a single-user namespace emapping would help the problem, but not solve all use cases.

          Show
          ebadger Eric Badger added a comment - Looks like I should have reloaded the page before commenting as I didn't see Shane Kumpf 's comment. It's unfortunate that docker's namespace remapping doesn't work for multiple users. I guess a single-user namespace emapping would help the problem, but not solve all use cases.
          Hide
          ebadger Eric Badger added a comment -

          Zhankun Tang, would you like to put up an updated version of your patch given the consensus that we seem to have around adding the off-by-default "--user=UID:GID" option (without the usermod stuff)? If not, I'd be happy to do the legwork.

          Show
          ebadger Eric Badger added a comment - Zhankun Tang , would you like to put up an updated version of your patch given the consensus that we seem to have around adding the off-by-default "--user=UID:GID" option (without the usermod stuff)? If not, I'd be happy to do the legwork.
          Hide
          tangzhankun Zhankun Tang added a comment -

          Sorry, Eric Badger, I'd like to help but I'm out of office until mid of the next week. Is this ok for you? If not, you can go ahead to do it. Thanks.

          Show
          tangzhankun Zhankun Tang added a comment - Sorry, Eric Badger , I'd like to help but I'm out of office until mid of the next week. Is this ok for you? If not, you can go ahead to do it. Thanks.
          Hide
          ebadger Eric Badger added a comment -

          Zhankun Tang, I'm ok waiting until next week. I just want to make sure that we keep the ball rolling and moving this forward instead of stagnating. I can test the patch out once you upload, hopefully sometime late next week.

          Show
          ebadger Eric Badger added a comment - Zhankun Tang , I'm ok waiting until next week. I just want to make sure that we keep the ball rolling and moving this forward instead of stagnating. I can test the patch out once you upload, hopefully sometime late next week.
          Hide
          ebadger Eric Badger added a comment -

          Zhankun Tang, have had had any time to look into this further?

          Show
          ebadger Eric Badger added a comment - Zhankun Tang , have had had any time to look into this further?
          Hide
          tangzhankun Zhankun Tang added a comment -

          Eric Badger, Yes. We're currently working on it. luhuichun will update the patch soon.

          Show
          tangzhankun Zhankun Tang added a comment - Eric Badger , Yes. We're currently working on it. luhuichun will update the patch soon.
          Hide
          ebadger Eric Badger added a comment -

          Hey Zhankun Tang, luhuichun, wondering if there's any update/what the status of this is. Do you have any sort of target date. Thanks!

          Show
          ebadger Eric Badger added a comment - Hey Zhankun Tang , luhuichun , wondering if there's any update/what the status of this is. Do you have any sort of target date. Thanks!
          Hide
          luhuichun luhuichun added a comment -

          update

          Show
          luhuichun luhuichun added a comment - update
          Hide
          luhuichun luhuichun added a comment -

          Eric Badger Hi Eric, sorry for late response, we are working on another JIRA last week, here is the first patch

          Show
          luhuichun luhuichun added a comment - Eric Badger Hi Eric, sorry for late response, we are working on another JIRA last week, here is the first patch
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 30s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          0 mvndep 0m 14s Maven dependency ordering for branch
          +1 mvninstall 14m 29s trunk passed
          +1 compile 14m 41s trunk passed
          +1 checkstyle 1m 3s trunk passed
          +1 mvnsite 1m 22s trunk passed
          +1 mvneclipse 0m 51s trunk passed
          -1 findbugs 1m 0s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 5 extant Findbugs warnings.
          +1 javadoc 1m 5s trunk passed
          0 mvndep 0m 11s Maven dependency ordering for patch
          -1 mvninstall 0m 19s hadoop-yarn-server-nodemanager in the patch failed.
          -1 compile 1m 5s hadoop-yarn in the patch failed.
          -1 javac 1m 5s hadoop-yarn in the patch failed.
          -0 checkstyle 0m 51s hadoop-yarn-project/hadoop-yarn: The patch generated 4 new + 207 unchanged - 1 fixed = 211 total (was 208)
          -1 mvnsite 0m 21s hadoop-yarn-server-nodemanager in the patch failed.
          +1 mvneclipse 0m 30s the patch passed
          -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          -1 findbugs 0m 19s hadoop-yarn-server-nodemanager in the patch failed.
          +1 javadoc 0m 43s the patch passed
          -1 unit 0m 29s hadoop-yarn-api in the patch failed.
          -1 unit 0m 19s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 30s The patch does not generate ASF License warnings.
          53m 8s



          Reason Tests
          Failed junit tests hadoop.yarn.conf.TestYarnConfigurationFields



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ac17dc
          JIRA Issue YARN-4266
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12864772/YARN-4266.001.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux e889b734bfdc 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9460721
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
          mvninstall https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          compile https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt
          javac https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          mvnsite https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/whitespace-eol.txt
          findbugs https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15770/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/15770/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 30s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 0 mvndep 0m 14s Maven dependency ordering for branch +1 mvninstall 14m 29s trunk passed +1 compile 14m 41s trunk passed +1 checkstyle 1m 3s trunk passed +1 mvnsite 1m 22s trunk passed +1 mvneclipse 0m 51s trunk passed -1 findbugs 1m 0s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 5 extant Findbugs warnings. +1 javadoc 1m 5s trunk passed 0 mvndep 0m 11s Maven dependency ordering for patch -1 mvninstall 0m 19s hadoop-yarn-server-nodemanager in the patch failed. -1 compile 1m 5s hadoop-yarn in the patch failed. -1 javac 1m 5s hadoop-yarn in the patch failed. -0 checkstyle 0m 51s hadoop-yarn-project/hadoop-yarn: The patch generated 4 new + 207 unchanged - 1 fixed = 211 total (was 208) -1 mvnsite 0m 21s hadoop-yarn-server-nodemanager in the patch failed. +1 mvneclipse 0m 30s the patch passed -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply -1 findbugs 0m 19s hadoop-yarn-server-nodemanager in the patch failed. +1 javadoc 0m 43s the patch passed -1 unit 0m 29s hadoop-yarn-api in the patch failed. -1 unit 0m 19s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 30s The patch does not generate ASF License warnings. 53m 8s Reason Tests Failed junit tests hadoop.yarn.conf.TestYarnConfigurationFields Subsystem Report/Notes Docker Image:yetus/hadoop:0ac17dc JIRA Issue YARN-4266 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12864772/YARN-4266.001.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux e889b734bfdc 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 9460721 Default Java 1.8.0_121 findbugs v3.1.0-RC1 findbugs https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html mvninstall https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt compile https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt javac https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt mvnsite https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt whitespace https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/whitespace-eol.txt findbugs https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/15770/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15770/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/15770/console Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          ebadger Eric Badger added a comment -

          luhuichun, looks like there's some compilation problems with the patch

          Show
          ebadger Eric Badger added a comment - luhuichun , looks like there's some compilation problems with the patch
          Hide
          ebadger Eric Badger added a comment -

          I'm working on a new patch based on the one from Zhankun Tang and luhuichun, but would like some comments. In the patch I use the --user option to set the uid:gid of the user in the docker run command. After massaging the environment variables, installing the necessary packages in the image, and etc. this almost works perfectly. However, MRAppMaster tries to do a user lookup and fails because the user has no name, only a uid. So for jobs that run through the MRAppMaster, they will fail when trying to use a specific uid in the container. This doesn't necessarily apply to other jobs that could be run in the docker container on YARN, so I think that this change is still an improvement over the current implementation for some users.

          But clearly, my goal here is to run through the MRAppMaster with an arbitrary uid:gid pair. To do this, I propose mounting /var/run/nscd so that the docker container can lookup users via the host according to whatever method is defined in nsswitch.conf on the host. glibc will automatically go to the nscd socket to see if there is a service listening. If a service is listening (such as nscd or sssd on the host), then the lookup can leverage them to do the lookup. This gives us the ability to do remote authentication via ldap. The downside, of course, is that we're now bind mounting another directory, and a socket nonetheless. So I'm very interested in comments on this approach.

          cc Shane Kumpf, Daniel Templeton, Varun Vasudev, Sidharta Seethana

          Show
          ebadger Eric Badger added a comment - I'm working on a new patch based on the one from Zhankun Tang and luhuichun , but would like some comments. In the patch I use the --user option to set the uid:gid of the user in the docker run command. After massaging the environment variables, installing the necessary packages in the image, and etc. this almost works perfectly. However, MRAppMaster tries to do a user lookup and fails because the user has no name, only a uid. So for jobs that run through the MRAppMaster, they will fail when trying to use a specific uid in the container. This doesn't necessarily apply to other jobs that could be run in the docker container on YARN, so I think that this change is still an improvement over the current implementation for some users. But clearly, my goal here is to run through the MRAppMaster with an arbitrary uid:gid pair. To do this, I propose mounting /var/run/nscd so that the docker container can lookup users via the host according to whatever method is defined in nsswitch.conf on the host. glibc will automatically go to the nscd socket to see if there is a service listening. If a service is listening (such as nscd or sssd on the host), then the lookup can leverage them to do the lookup. This gives us the ability to do remote authentication via ldap. The downside, of course, is that we're now bind mounting another directory, and a socket nonetheless. So I'm very interested in comments on this approach. cc Shane Kumpf , Daniel Templeton , Varun Vasudev , Sidharta Seethana

            People

            • Assignee:
              luhuichun luhuichun
              Reporter:
              sidharta-s Sidharta Seethana
            • Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:

                Development