Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: yarn
    • Labels: None

      Description

      We need the ability to localize docker images when those images aren't already available locally. There are various approaches that could be used here, with different trade-offs/issues: image archives on HDFS + docker load, docker pull during the localization phase, or (automatic) docker pull during the run/launch phase.

      We also need the ability to clean up old/stale, unused images.
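
      For illustration only, a minimal sketch of those approaches in shell form (the paths, registry host, and image names here are hypothetical, not anything this JIRA prescribes):

          # Approach 1: image archive on HDFS + docker load
          hdfs dfs -get /apps/images/myapp-1.0.tar /tmp/myapp-1.0.tar   # localize the archive
          docker load -i /tmp/myapp-1.0.tar                             # load it into the local daemon

          # Approach 2: explicit docker pull during the localization phase
          docker pull registry.example.com/myapp:1.0

          # Approach 3 (implicit): docker run pulls the image automatically
          # when it is not already present locally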

        Activity

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Thanks, luhuichun!

        luhuichun luhuichun added a comment -

        Shane Kumpf Hi Shane, it's ok with me to transfer it.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        @luhuichun Zhankun Tang - Thanks for your efforts here. I'd really like to get this wrapped up and would be happy to take this over. Let me know if that would be ok?

        tangzhankun Zhankun Tang added a comment -

        Thanks a lot, Shane.

        tangzhankun Zhankun Tang added a comment -

        I have created two sub-tasks, YARN-5669 and YARN-5670. This JIRA will be kept for the Docker-type local resource implementation.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Zhankun Tang Just wanted to note that those sub-tasks look good to me and I can assist once they are opened. Thanks!

        tangzhankun Zhankun Tang added a comment -

        Varun Vasudev, thanks for the review.
        Yes, I want to go ahead and start the implementation. It would be great if Shane Kumpf could help.

        How about we create the sub-tasks below in this JIRA (or in YARN-3611?):
        1. Add support for Docker pull command
        2. Add Docker type local resource to enable Docker image localization
        3. Add support for Docker image clean up

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Today, when credential issues occur during the implicit docker pull, the "image <image name> not found" error is hidden away in the NM logs. Initially my thought was that failing fast and exposing the error to the user via the application logs would be sufficient without credential validation; however, after testing, docker pull behaves differently across the various credential issue scenarios.

        If the credentials file is missing or the image does not exist in the registry, docker pull returns an "image <image name> not found" error (and rc=1). As a result, it is difficult to tell whether the user supplied a non-existent image or whether credentials are missing. This deficiency could be addressed easily enough via a log message and some documentation, though.

        If the credentials exist, but are incorrect (incorrect auth token), it's more challenging to address as docker falls back to the interactive docker login prompt. Ideally, docker would provide a command line option to avoid this fall back to make the behavior consistent, and it may be worth starting that discussion within the docker community towards adding this option. Otherwise, we would need to try to handle detecting the difference between a long running docker pull and a docker pull that is waiting for the user to input credentials.

        I'll note that, unfortunately, I'm not aware of any options in the docker client CLI to validate credentials. It may be possible by resorting to checks against the registry REST API, but ideally, we want to stick with the docker client CLI as we've done so far. I'll look into a clean way to address validation.
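
        To make the behaviors described above concrete, here is a hypothetical reproduction sketch (the registry host and image name are made up, and this is not part of any patch):

            # Missing credentials file OR image absent from the registry: both
            # cases return "image <image name> not found" with rc=1, so they are
            # indistinguishable from the exit code alone.
            docker pull registry.example.com/private/myapp:1.0 ; echo "rc=$?"

            # Incorrect auth token: docker may fall back to an interactive login
            # prompt; closing stdin and bounding the runtime is one crude guard.
            timeout 600 docker pull registry.example.com/private/myapp:1.0 < /dev/null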

        vvasudev Varun Vasudev added a comment -

        Zhankun Tang - the general approach looks ok to me. Do you want to go ahead and file tickets for the tasks and start the implementation?

        tangzhankun Zhankun Tang added a comment -

        Daniel Templeton, thanks for the review.
        My current plan is that we don't validate the credentials in this patch. The credentials may be better validated in YARN-5428 (the application-supplied config.json approach).
        I expect the docker pull operation to fail fast if no credentials are prepared; the container then eventually fails.

        tangzhankun Zhankun Tang added a comment -

        We need to delete application-scoped Docker images on the local host once the application finishes, so the hint file is used to record Docker images that might be deleted at some point by the DeletionService. I think we should do it this way, but this is not the final version. Any alternatives are welcome.
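
        As a purely hypothetical illustration of the hint file idea (the file name and format below are invented, not taken from the patch), the localizer would write the image name into the resource's local directory and the DeletionService would read it back at cleanup:

            # at localization time
            echo "registry.example.com/myapp:1.0" > "$LOCAL_DIR/docker_image.hint"

            # at application cleanup time (DeletionService)
            docker rmi "$(cat "$LOCAL_DIR/docker_image.hint")"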

        jianhe Jian He added a comment -

        Thanks Zhankun Tang, question on this:

        leave a hint file in corresponding resource local directory to record the image name.

        What is this hint file used for?

        templedf Daniel Templeton added a comment -

        Thanks, Zhankun Tang. I just took a look through the proposal, and it looks good. In the implementation section, you mention that your solution for dealing with credentials is to assume that proper credentials exist. Are you planning to validate or enforce that assumption? If not, the implementation will have to deal with what happens when the credentials don't exist or aren't correct.

        tangzhankun Zhankun Tang added a comment -

        A new proposal for docker pull during the localization phase. Please review.

        tangzhankun Zhankun Tang added a comment -

        Varun Vasudev, yes, the proposal is in progress.

        vvasudev Varun Vasudev added a comment -

        Zhankun Tang - I think it's best if we go with the docker pull approach.

        tangzhankun Zhankun Tang added a comment -

        Yes, it seems that our direction is towards "docker pull" during localization. Will we just discard the "HDFS + docker load" approach and design it based on "docker pull"?

        sidharta-s Sidharta Seethana added a comment -

        YARN-3289 was filed in the context of DockerContainerExecutor - though the discussion there is still quite relevant. My preference would be to keep all discussions related to features/fixes here under the YARN-3611 umbrella JIRA.

        tangzhankun Zhankun Tang added a comment -

        Sidharta Seethana, thanks for the update!
        For the security issue exposed by "docker load", my conclusion after investigation is that there seems to be no way to solve it: we cannot ensure the tarball that "docker load" imports is trusted, even if we managed to solve the tag name conflicts.

        Also, docker pull during localization seems to duplicate YARN-3289.

        sidharta-s Sidharta Seethana added a comment -

        I updated the description based on our discussion here.

        vvasudev Varun Vasudev added a comment -

        Agree with the docker pull approach. A couple of things to keep in mind - as Shane Kumpf mentioned above, docker pulls can become extremely slow, so we should probably have some support for timing out pulls and failing the container. On the flip side, we should also clean up unused images to manage disk space. We could fall back on the existing YARN application resource scopes - public, application, container - to help figure out when to clean up the images. This would also seem to fit Daniel Templeton's suggestion of making the pull part of localization.

        tangzhankun Zhankun Tang added a comment -

        Yes. An explicit "docker pull" has the benefit over an implicit one that we can know when the pull process is not succeeding and can kill it. If we delegate this to an implicit pull by "docker run", we have no way to distinguish why the container failed to launch.

        templedf Daniel Templeton added a comment -

        My concern here is that it's not easy to control an explicit docker pull, because there seems to be no interface for us to check the progress or pause/cancel the pull process currently.

        As far as I can tell, the docker pull CLI gives you basic control. It blocks until the pull is complete, and if you kill the CLI, it kills the pull. You could add the pull as a command to the container-executor and run it from the localizer.

        So there seems to be no big difference versus just letting Docker do it implicitly.

        Except that when it's done implicitly the clock is ticking, so a large image can cause the container to appear stalled, as far as I understand it.

        A big issue is that this patch has a security risk, as mentioned by Sidharta Seethana.

        Yes, the answer is to have Docker always pull a new image. That's common practice.
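
        A minimal sketch of that suggestion (the wiring is hypothetical; a real patch would route the pull through the container-executor rather than a shell script):

            # Block on the pull, kill it hard if it exceeds the localization
            # timeout, and fail the container on any non-zero exit.
            if ! timeout --signal=KILL 600 docker pull registry.example.com/myapp:1.0; then
                echo "docker pull failed or timed out; failing localization" >&2
                exit 1
            fi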

        tangzhankun Zhankun Tang added a comment - edited

        Shane Kumpf, thanks, and I totally agree with your goals for the Docker image localization topic. Whether we use the HDFS distributed cache or an HDFS-backed private repo is fine with me. I also mentioned the private repo approach in the doc, noting that the doc and patch are a solution for users who don't want to maintain a private repo. I believe a well-maintained private Docker repo will be a good choice for many people and requires no extra work from YARN.

        As for a docker pull that happens while someone is pushing a new version of the image, I think that's a rolling-update problem. The new version should get a new tag, and having the administrator manually roll the application forward will be fine.

        Let's get back to this patch. In essence, "HDFS + save/load" tries to mimic a private Docker repo. There are two parts to consider in the whole process, and this patch brings in extra steps/issues in exchange for its simplicity:

        • 1. Docker image generation and upload to storage
          • This patch adds an extra "docker save" step compared with native Docker image generation, and it requires the application to remember the URI, whereas with Docker one only needs to know the tag name after uploading to storage.
        • 2. Image distribution/localization
          • This patch distributes a tar file through the distributed cache, so distribution cannot be sped up by downloading only deltas the way docker pull does, and it consumes more network bandwidth.
          • A big issue is that this patch has a security risk, as mentioned by Sidharta Seethana - thanks, Sidharta, for pointing out what I hadn't realized before. Because of potential tag name conflicts, different users may replace each other's Docker images. Currently we cannot avoid this, because YARN has no way to distinguish the tag names of two Docker image tar files: YARN only knows it is a Docker image tar file, and cannot know whether loading it will replace someone else's image. Although there is also no tag name conflict check with "docker push", an administrator can avoid conflicts when pushing so that each image gets a unique tag name. In any case, this patch opens a hole for a user to attack existing Docker images. One way to solve this would be an option in Docker that refuses to force-load an image whose tag name already exists.

        To sum up, this patch eliminates the need to set up a private repo, but adds extra work for admins/applications and carries potential risk due to the attack surface of docker load. I'll raise this issue with Docker; thanks again, folks. I also think we should be clearer about the motivation of this JIRA, Sidharta Seethana. Thoughts?

        tangzhankun Zhankun Tang added a comment -

        Daniel Templeton, thanks for pointing out the extra effort of running "docker save" on the images and uploading them into HDFS. These steps, and the extra network bandwidth consumed by the HDFS upload, are indeed unavoidable in the HDFS + save/load approach.

        If we use "docker pull" before launching the container to prepare the image, this becomes a question of how to get fine-grained control of Docker image pulls in YARN when utilizing a private repo (whether its storage backend is HDFS or not).

        An explicit pull has a potential benefit if YARN can be aware of the pull's progress and manipulate the process. My concern here is that it's not easy to control an explicit docker pull, because there seems to be no interface for us to check the progress or pause/cancel the pull process currently. In that situation, an explicit "docker pull" in YARN can only check whether the pull has timed out and then perhaps kill it, and such timeout checks already exist for launching containers. So there seems to be no big difference versus just letting Docker do it implicitly. Thoughts?

        sidharta-s Sidharta Seethana added a comment -

        I had spent some time looking into this as well. My biggest concern with using HDFS + save/load is that it is very easy for users to clobber each other's images. This can lead to some hard-to-trace bugs and also has security implications. Here is one example - UserA is using fedora:latest on a given node. UserB comes along, localizes their own image file, and triggers a docker load. If this image is also tagged fedora:latest, the existing tag is replaced. This is shown below (on a single node - the example is somewhat contrived, but it demonstrates the issue I am describing):

        sidharta-laptop (13:11:11) /tmp$ docker images | grep ^centos
        centos                       centos6             ae6154bcd445        9 weeks ago         194.5 MB
        sidharta-laptop (13:11:13) /tmp$ docker images | grep ^fedora
        sidharta-laptop (13:11:35) /tmp$ docker tag ae6154bcd445 fedora:latest
        sidharta-laptop (13:12:40) /tmp$ docker images | grep ^fedora
        fedora                       latest              ae6154bcd445        9 weeks ago         194.5 MB
        sidharta-laptop (13:12:42) /tmp$ docker save fedora:latest > fedora.tar
        sidharta-laptop (13:14:05) /tmp$ docker rmi fedora:latest
        Untagged: fedora:latest
        sidharta-laptop (13:14:19) /tmp$ docker pull fedora:latest
        latest: Pulling from library/fedora
        
        7c91a140e7a1: Pull complete
        Digest: sha256:a97914edb6ba15deb5c5acf87bd6bd5b6b0408c96f48a5cbd450b5b04509bb7d
        Status: Downloaded newer image for fedora:latest
        sidharta-laptop (13:14:37) /tmp$ docker images | grep ^fedora
        fedora                       latest              f9873d530588        4 weeks ago         204.4 MB
        sidharta-laptop (13:14:43) /tmp$ docker load -i fedora.tar
        The image fedora:latest already exists, renaming the old one with ID sha256:f9873d530588316311ac1d3d15e95487b947f5d8b560e72bdd6eb73a7831b2c4 to empty string
        Loaded image: fedora:latest
        sidharta-laptop (13:15:01) /tmp$ docker images | grep ^fedora
        fedora                       <none>              f9873d530588        4 weeks ago         204.4 MB
        fedora                       latest              ae6154bcd445        9 weeks ago         194.5 MB
        sidharta-laptop (13:15:15) /tmp$
        

        It is very easy for UserB to wreak havoc on UserA by reusing tags. You can see above that the 'official' fedora:latest tag got replaced by one loaded from an archive. Using a registry directly alleviates some of the concerns - but there are other aspects to consider, as mentioned by Daniel Templeton and Shane Kumpf.

        templedf Daniel Templeton added a comment -

        Today, when the docker run in container executor is issued, a docker pull is run behind the scenes, similar to what you are suggesting.

        As I understand it, the pull is implicit in the run when the image is not already present. The problem as I understand it is that the container can appear to be stalled because of a slow image download. What I was suggesting was making the pull explicit and to drive it from the localization framework. When the pull is complete, then the container can be launched.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        It appears I had save/load and export/import backwards - save/load is the right approach. Please disregard that part.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Zhankun Tang thanks for the patch and doc. It echoes many of my concerns.

        I've given image localization and management quite a bit of thought, and so far I haven't come up with a great solution here. Some of the goals I had in mind that, IMO, should be carried forward are to minimize dependence on the internet, get the container started as fast as possible, ensure the same image is used for the duration of an application, and maintain the image's metadata.

        Daniel Templeton Today, when the docker run in container executor is issued, a docker pull is run behind the scenes, similar to what you are suggesting. The potential for timeouts is high in unstable networks. This also doesn't work for docker hub private repositories, but that is a separate issue that needs to be filed.

        One comment on the approach here, IIRC, docker export/import also retains the history and layers, whereas save/load flattens, so we should likely use export/import vs save/load.

        The approach outlined in the patch does have its merits. You are not dependent on being able to pull from docker hub or a private registry and could ensure that the same image is run by all of the tasks in the job. I believe it would be possible to keep the image metadata intact as well.

        My concern with using Docker Hub or a private registry is what happens during a long-running job if someone pushes a new "latest" to the registry. Would the docker pull result in the last part of my application running a different image (perhaps that doesn't apply to what you have in mind)? However, I completely agree with Daniel's concerns on the current approach, plus it's extra work administrators now have to do to get the images packaged and into HDFS.
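
        One possible mitigation for the mutable "latest" problem - offered here as a hypothetical aside, not something proposed in this thread - is to pull by content digest rather than by tag, so every task in the job resolves the same bytes even if the tag moves (the digest value is reused from Sidharta Seethana's transcript above):

            docker pull fedora@sha256:a97914edb6ba15deb5c5acf87bd6bd5b6b0408c96f48a5cbd450b5b04509bb7d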

        Somewhat OT, but I started on an HDFS storage plugin for the docker registry storage driver API; the API was changing daily, though, so I put it on the back burner pending a bit more stabilization - see docker-registry-hdfs if you want to play with it. It allows for doing a docker pull from a private registry backed by HDFS. This would help satisfy the goal of not depending on the internet/docker hub and maintaining the image's metadata, but beyond that it doesn't buy us much. I'm hopeful there are interesting "hacks" where this might provide benefits to YARN in the future.

        Perhaps image management and localization should be handled outside of the application lifecycle? Otherwise, image localization could introduce significant lag for starting containers (which may be OK?).

        Interested in other's thoughts here.

        templedf Daniel Templeton added a comment -

        Zhankun Tang, thanks for the doc and patch.

        It appears that this patch focuses only on using the distributed cache as the vehicle for sharing Docker images. While that works, it would be more desirable to have a Docker pull happen from a Docker repo. Otherwise the user has to pull the image out of the repo to stuff it into HDFS, tripling the total amount of data moved over the network (repo->client, client->HDFS, HDFS->NM rather than repo->NM). The image files may be very large, making the extra data movement and storage a potential issue. This approach also creates the potential for image version mismatches and forces the user to manage the image layers. For these reasons, I think it would make more sense to have the localization process execute a Docker pull for the image on the target host instead.

        When considering localization through a Docker pull command, you should also consider how to deal with secure registries.

        sidharta-s Sidharta Seethana added a comment -

        Zhankun Tang, thanks for uploading the doc and the patch. I'll take a look at them both and get back to you.

        tangzhankun Zhankun Tang added a comment -

        One more thing to note is that this patch needs a config entry in container-executor.cfg to run the docker command:

        docker.binary=/usr/bin/docker
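
        For context, a sketch of how that entry might sit alongside the standard container-executor.cfg keys (the values shown are examples only):

            yarn.nodemanager.linux-container-executor.group=hadoop
            banned.users=hdfs,yarn,mapred,bin
            min.user.id=1000
            docker.binary=/usr/bin/docker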

        tangzhankun Zhankun Tang added a comment -

        Sidharta Seethana, Daniel Templeton, Jun Gong, Zhongyue Nah, I have implemented a draft patch for your review. Current results of "test-patch":

        Vote  Subsystem    Runtime   Comment
        ============================================================================
          0   findbugs     0m 0s     Findbugs executables are not available.
         +1   @author      0m 0s     The patch does not contain any @author tags.
         +1   test4tests   0m 0s     The patch appears to include 2 new or modified test files.
         +1   mvninstall   3m 20s    branch-2.8 passed
         +1   compile      0m 15s    branch-2.8 passed
         +1   checkstyle   0m 10s    branch-2.8 passed
         +1   mvnsite      0m 15s    branch-2.8 passed
         +1   mvneclipse   0m 7s     branch-2.8 passed
         +1   javadoc      0m 9s     branch-2.8 passed
         +1   mvninstall   0m 12s    the patch passed
         +1   compile      0m 14s    the patch passed
         +1   javac        0m 14s    the patch passed
         -1   checkstyle   0m 9s     hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 175 unchanged - 1 fixed = 176 total (was 176)
         +1   mvnsite      0m 14s    the patch passed
         +1   mvneclipse   0m 5s     the patch passed
         +1   whitespace   0m 0s     The patch has no whitespace issues.
         +1   javadoc      0m 9s     the patch passed
         -1   unit         9m 3s     hadoop-yarn-server-nodemanager in the patch failed.
         +1   asflicense   0m 8s     The patch does not generate ASF License warnings.
                           14m 51s

        Reason             | Tests
        Failed junit tests | hadoop.yarn.server.nodemanager.TestLocalDirsHandlerService
                           | hadoop.yarn.server.nodemanager.TestDirectoryCollection
                           | hadoop.yarn.server.nodemanager.TestNodeStatusUpdater
                           | hadoop.yarn.server.nodemanager.containermanager.TestNMProxy

        For the checkstyle -1:
        The "call()" method in ContainerLaunch.java has a legacy style issue: it already exceeds the maximum method length (209 lines, max allowed 150), so any line added to this method triggers the -1. I think we should either refactor the "call()" method or raise the maximum allowed length.

        For the test case failures:
        These tests fail even on my unmodified Hadoop branch-2.8. I don't know the cause yet, but I believe they are unrelated to my patch.

        Looking forward to your advice.

        tangzhankun Zhankun Tang added a comment -

        A draft version of the patch for your review.

        tangzhankun Zhankun Tang added a comment -

        There is an existing environment variable that can be used to indicate Docker image resources.

        tangzhankun Zhankun Tang added a comment -

        Sidharta Seethana, Jun Gong, Zhongyue Nah, I have attached a draft design document; please review and comment.

        sidharta-s Sidharta Seethana added a comment -

        Zhankun Tang, I have assigned the ticket to you. Thanks for helping!

        tangzhankun Zhankun Tang added a comment -

        Sidharta Seethana,
        Yeah, thanks. We have implemented a workaround for this "docker load" in our YARN application, so I'll post an update on our LCE approach for discussion soon; you can assign this ticket to me.

        sidharta-s Sidharta Seethana added a comment -

        Zhongyue Nah, Zhankun Tang,

        We haven't had much time to look into this - it would be great if you could help! Would you mind updating this ticket with a brief writeup of your approach? I could also reassign this ticket to one of you; please let me know.

        tangzhankun Zhankun Tang added a comment -

        @Sidharta Seethana, is there any progress on this docker image localization? YARN-3289 has no recent updates either.

        zyluo Zhongyue Nah added a comment -

        Vinod Kumar Vavilapalli I think YARN-3289 is about supporting diff pulls from NMs while this is to support 'docker load' from LCE?

        Sidharta Seethana Do you have any updates on this? We have implemented 'docker load' in the LCE and are wondering whether you would mind us submitting patches.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Sidharta Seethana, this one's old, but is it a dup of YARN-3289? If the goal is the same, we should close the newer of the two and change the JIRA hierarchy etc. of the older one as needed.

        hex108 Jun Gong added a comment -

        I think what we need is a private registry. Push the local image to the private registry when (or before) submitting the app; then the NM can pull it from the private registry.

        BTW: we could build the private registry using HDFS as the storage backend (https://github.com/hex108/docker-registry-driver-hdfs), and it works well in our cluster.
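
        A minimal sketch of that flow, with a hypothetical registry endpoint standing in for the HDFS-backed registry mentioned above:

            # client side, at (or before) application submission
            docker tag myapp:1.0 registry.example.com:5000/myapp:1.0
            docker push registry.example.com:5000/myapp:1.0

            # NodeManager side, during localization or launch
            docker pull registry.example.com:5000/myapp:1.0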


          People

          • Assignee: shanekumpf@gmail.com Shane Kumpf
          • Reporter: sidharta-s Sidharta Seethana
          • Votes: 0
          • Watchers: 21
