Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: yarn
    • Labels:

      Description

The docker client allows for specifying a configuration directory that contains the docker client's configuration. It is common to store "docker login" credentials in this config, to avoid the need to run docker login on each cluster member.

By default, the docker client config is $HOME/.docker/config.json on Linux. However, this does not work with the container executor's current user switching, and it may also be desirable to centralize this configuration outside a single user's home directory.

Note that the command line arg specifies the configuration directory, NOT the configuration file.

      This change will be needed to allow YARN to automatically pull images at localization time or within container executor.
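For reference, the credentials that "docker login" stores in the client config file look roughly like the following (the registry host and base64-encoded auth value here are illustrative, not from this issue):

```json
{
  "auths": {
    "registry.example.com": {
      "auth": "dXNlcjpwYXNzd29yZA=="
    }
  }
}
```

The directory passed to the docker client is expected to contain this config.json file, which is why the option takes a directory rather than a file path.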

      1. YARN-5428.001.patch
        7 kB
        Shane Kumpf
      2. YARN-5428.002.patch
        8 kB
        Shane Kumpf
      3. YARN-5428.003.patch
        8 kB
        Shane Kumpf
      4. YARN-5428.004.patch
        8 kB
        Shane Kumpf
      5. YARN-5428Allowforspecifyingthedockerclientconfigurationdirectory.pdf
        46 kB
        Shane Kumpf

        Activity

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Attaching a patch that allows for specifying the docker client configuration directory.

        It adds the configuration parameter yarn.nodemanager.runtime.linux.docker.client-config-directory to yarn-site to allow the admin to specify the directory where the docker configuration file is stored on each node.
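A yarn-site.xml entry for this parameter might look like the following (the directory path is an illustrative example, not a default):

```xml
<property>
  <name>yarn.nodemanager.runtime.linux.docker.client-config-directory</name>
  <value>/etc/hadoop/docker-client-config</value>
</property>
```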

Note that the --config=<client config directory> option is a docker CLI option and needs to come before the action (run, load, pull, etc.), so it was added to the DockerCommand abstract class. This will prove useful for other options that need to be prepended to the docker command line args, such as debug. The config option has only been added to launchContainer, as that is where the implicit docker pull occurs and the credentials would be needed.
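As a rough sketch of the prepended-option idea described above (the class and method names follow this comment, but the actual patch may differ in detail):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of a DockerCommand-style builder that prepends global
// docker options (like --config) before the action. Names are illustrative.
abstract class DockerCommand {
    private final String action;                              // e.g. "run", "pull"
    private final List<String> prependedArgs = new ArrayList<>();
    private final List<String> actionArgs = new ArrayList<>();

    protected DockerCommand(String action) {
        this.action = action;
    }

    // --config must appear before the action on the docker command line,
    // so it goes into the prepended args rather than the action args.
    public void setClientConfigDir(String clientConfigDir) {
        prependedArgs.add("--config=" + clientConfigDir);
    }

    protected void addActionArg(String arg) {
        actionArgs.add(arg);
    }

    public String getCommandWithArguments() {
        List<String> parts = new ArrayList<>();
        parts.add("docker");
        parts.addAll(prependedArgs);   // global options first
        parts.add(action);             // then the action
        parts.addAll(actionArgs);      // then action-specific args
        return String.join(" ", parts);
    }
}

class DockerRunCommand extends DockerCommand {
    DockerRunCommand(String image) {
        super("run");
        addActionArg(image);
    }
}
```

With this shape, a run command built with a client config dir renders as `docker --config=/etc/docker-conf run <image>`, keeping the global option ahead of the action.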

        shanekumpf@gmail.com Shane Kumpf added a comment -

The more I think about this, the more I think we should add a call to setClientConfigDir in the constructor if the configuration property is set. There really isn't a reason all commands shouldn't use the prepended args as well. I'll upload a new patch with this change unless anyone feels differently.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 14s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        0 mvndep 0m 57s Maven dependency ordering for branch
        +1 mvninstall 8m 20s trunk passed
        +1 compile 2m 45s trunk passed
        +1 checkstyle 0m 45s trunk passed
        +1 mvnsite 1m 6s trunk passed
        +1 mvneclipse 0m 28s trunk passed
        +1 findbugs 1m 53s trunk passed
        +1 javadoc 0m 37s trunk passed
        0 mvndep 0m 9s Maven dependency ordering for patch
        +1 mvninstall 0m 53s the patch passed
        +1 compile 2m 44s the patch passed
        +1 javac 2m 44s the patch passed
        -1 checkstyle 0m 43s hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 230 unchanged - 0 fixed = 231 total (was 230)
        +1 mvnsite 0m 57s the patch passed
        +1 mvneclipse 0m 22s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 2m 14s the patch passed
        +1 javadoc 0m 36s the patch passed
        -1 unit 0m 27s hadoop-yarn-api in the patch failed.
        -1 unit 13m 26s hadoop-yarn-server-nodemanager in the patch failed.
        +1 asflicense 0m 17s The patch does not generate ASF License warnings.
        40m 43s



        Reason Tests
        Failed junit tests hadoop.yarn.conf.TestYarnConfigurationFields
          hadoop.yarn.server.nodemanager.TestDirectoryCollection



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12820062/YARN-5428.001.patch
        JIRA Issue YARN-5428
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux f326a2a08d07 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / d383bfd
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/12495/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/12495/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/12495/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/12495/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt https://builds.apache.org/job/PreCommit-YARN-Build/12495/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/12495/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/12495/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        shanekumpf@gmail.com Shane Kumpf added a comment -

After taking a harder look, adding this to the DockerCommand constructor would require passing the Configuration into DockerCommand. This is not ideal, so it was left up to the caller to add this property as needed. I have updated the two places where DockerCommand is used to include the config arg.

Regarding the checkstyle error, it appears that launchContainer was already at the limit of 150 lines. It doesn't seem appropriate to fix this here; a new issue could be opened to refactor launchContainer to get it under the line limit.

The hadoop-yarn-api failure occurs because this is an optional configuration with no default value, since the docker client already has its own default (~/.docker). I updated the test to ignore this property. The nodemanager failure looks to be unrelated to this patch.

        Uploading a new patch that addresses the failure and adds the appropriate calls when the docker client config is needed.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 18s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        0 mvndep 0m 9s Maven dependency ordering for branch
        +1 mvninstall 8m 36s trunk passed
        +1 compile 2m 41s trunk passed
        +1 checkstyle 0m 40s trunk passed
        +1 mvnsite 1m 9s trunk passed
        +1 mvneclipse 0m 27s trunk passed
        +1 findbugs 2m 14s trunk passed
        +1 javadoc 0m 41s trunk passed
        0 mvndep 0m 11s Maven dependency ordering for patch
        +1 mvninstall 1m 0s the patch passed
        +1 compile 3m 8s the patch passed
        +1 javac 3m 8s the patch passed
        -1 checkstyle 0m 38s hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 236 unchanged - 1 fixed = 238 total (was 237)
        +1 mvnsite 0m 49s the patch passed
        +1 mvneclipse 0m 21s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 57s the patch passed
        +1 javadoc 0m 32s the patch passed
        +1 unit 0m 24s hadoop-yarn-api in the patch passed.
        -1 unit 12m 58s hadoop-yarn-server-nodemanager in the patch failed.
        +1 asflicense 0m 18s The patch does not generate ASF License warnings.
        40m 0s



        Reason Tests
        Failed junit tests hadoop.yarn.server.nodemanager.TestDirectoryCollection



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12820173/YARN-5428.002.patch
        JIRA Issue YARN-5428
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 387e729fdb8c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / f1a4863
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/12505/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/12505/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/12505/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/12505/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/12505/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        As previously mentioned, the checkstyle failures are because two methods were already at 150 lines, and my additions put them over. We should refactor these in another issue. The unit test failure appears unrelated. I believe this is ready for review.

        vvasudev Varun Vasudev added a comment -

        Thanks for the patch Shane Kumpf. Some feedback -

        1)

        +    String clientConfigDir = conf.get(
        +        YarnConfiguration.NM_DOCKER_CLIENT_CONFIG_DIRECTORY);
        +    if(clientConfigDir != null) {
        +      runCommand.setClientConfigDir(clientConfigDir);
        +    }
        +
        
        +
        +      String clientConfigDir = conf.get(
        +          YarnConfiguration.NM_DOCKER_CLIENT_CONFIG_DIRECTORY);
        +      if(clientConfigDir != null) {
        +        stopCommand.setClientConfigDir(clientConfigDir);
        +      }
        +
        

        Can we just move the null check into setClientConfigDir itself?
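A minimal sketch of that suggestion (the class name here is hypothetical; only the setter pattern matters): moving the null check inside setClientConfigDir lets callers pass the result of conf.get(...) directly without guarding it at every call site.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder illustrating the null-guarded setter Varun suggests:
// a null (unset) configuration value is simply ignored.
class DockerCommandArgs {
    private final List<String> prependedArgs = new ArrayList<>();

    public void setClientConfigDir(String clientConfigDir) {
        if (clientConfigDir != null) {
            prependedArgs.add("--config=" + clientConfigDir);
        }
    }

    public List<String> getPrependedArgs() {
        return prependedArgs;
    }
}
```

Callers would then reduce to a single line such as `runCommand.setClientConfigDir(conf.get(YarnConfiguration.NM_DOCKER_CLIENT_CONFIG_DIRECTORY));`.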

        Rest of the patch looks good to me.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Thanks for the review Varun Vasudev! I'll get a new patch uploaded shortly with that fix.

        vvasudev Varun Vasudev added a comment -

        Shane Kumpf - sorry I forgot to ask - does the client configuration directory have to be passed for commands like docker pull, docker inspect, etc? We run these commands in the container-executor.

        shanekumpf@gmail.com Shane Kumpf added a comment -

While it would be a good idea to pass this to all docker commands, it is not necessary for the use case we have in mind. The types of configuration typically stored in the client config are formatting rules, HTTP headers, credentials, and shortcuts. The main one we are concerned with is credentials, which are only needed by docker run/pull, and those are properly handled by the container executor, as the --config option is added to the docker command file. Hopefully we can address a broader set of docker client commands once the container executor is refactored.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 15s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        0 mvndep 0m 10s Maven dependency ordering for branch
        +1 mvninstall 6m 53s trunk passed
        +1 compile 2m 19s trunk passed
        +1 checkstyle 0m 42s trunk passed
        +1 mvnsite 0m 52s trunk passed
        +1 mvneclipse 0m 24s trunk passed
        +1 findbugs 1m 40s trunk passed
        +1 javadoc 0m 34s trunk passed
        0 mvndep 0m 10s Maven dependency ordering for patch
        +1 mvninstall 0m 44s the patch passed
        +1 compile 2m 17s the patch passed
        +1 javac 2m 17s the patch passed
        -1 checkstyle 0m 38s hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 237 unchanged - 1 fixed = 239 total (was 238)
        +1 mvnsite 0m 49s the patch passed
        +1 mvneclipse 0m 21s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 55s the patch passed
        +1 javadoc 0m 31s the patch passed
        +1 unit 0m 23s hadoop-yarn-api in the patch passed.
        -1 unit 13m 6s hadoop-yarn-server-nodemanager in the patch failed.
        +1 asflicense 0m 18s The patch does not generate ASF License warnings.
        35m 47s



        Reason Tests
        Failed junit tests hadoop.yarn.server.nodemanager.TestDirectoryCollection



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12820484/YARN-5428.003.patch
        JIRA Issue YARN-5428
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 19789f0fc0cf 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 54fe17a
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/12522/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/12522/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/12522/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/12522/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/12522/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        shanekumpf@gmail.com Shane Kumpf added a comment -

A new patch has been uploaded. The failures are the same as before (method line length and the unrelated nodemanager test failure).

        vvasudev Varun Vasudev added a comment -

Thanks for the patch Shane Kumpf. Mostly good, except for some formatting fixes - the formatting of the 'if' conditions seems off.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Varun Vasudev Thanks for the heads up. I've reformatted the patch and am attaching it now. Let me know if you still see any issues. Thanks!

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 16s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        0 mvndep 0m 10s Maven dependency ordering for branch
        +1 mvninstall 6m 41s trunk passed
        +1 compile 2m 16s trunk passed
        +1 checkstyle 0m 41s trunk passed
        +1 mvnsite 0m 52s trunk passed
        +1 mvneclipse 0m 25s trunk passed
        +1 findbugs 1m 40s trunk passed
        +1 javadoc 0m 34s trunk passed
        0 mvndep 0m 9s Maven dependency ordering for patch
        +1 mvninstall 0m 44s the patch passed
        +1 compile 2m 13s the patch passed
        +1 javac 2m 13s the patch passed
        -1 checkstyle 0m 38s hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 237 unchanged - 1 fixed = 239 total (was 238)
        +1 mvnsite 0m 49s the patch passed
        +1 mvneclipse 0m 20s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 52s the patch passed
        +1 javadoc 0m 31s the patch passed
        +1 unit 0m 22s hadoop-yarn-api in the patch passed.
        -1 unit 13m 3s hadoop-yarn-server-nodemanager in the patch failed.
        +1 asflicense 0m 18s The patch does not generate ASF License warnings.
        35m 21s



        Reason Tests
        Failed junit tests hadoop.yarn.server.nodemanager.TestDirectoryCollection



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12821029/YARN-5428.004.patch
        JIRA Issue YARN-5428
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 09b4d0d0ede2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 95f2b98
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/12564/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/12564/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/12564/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/12564/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/12564/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        tangzhankun Zhankun Tang added a comment - - edited

        Thanks for the patch, Shane Kumpf. One question:

        I remember that "docker login" stores the credentials in ~/.docker/config.json by default. Will this patch eliminate the need for "docker login"? Or should the administrator store credentials in the config.json file manually?

        shanekumpf@gmail.com Shane Kumpf added a comment -

        We don't pass down the $HOME environment variable or expand out ~, so the default setting of ~/.docker/config.json will not be honored when running docker containers on YARN.

        This patch gives you a choice as to where you store the config.json file. An administrator still needs to deploy config.json to the location specified by this configuration. Deploying the config.json file pre-populated with credentials is an alternative to the interactive "docker login" command. Also note that other client configuration can be stored in config.json, such as formatting rules, HTTP proxy settings, and a few others.

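To make that concrete, here is a hedged sketch of a minimal config.json of the kind "docker login" writes; the registry host, username, password, and directory path below are all invented placeholders:

```shell
# Illustrative only: a minimal config.json as produced by "docker login".
# registry.example.com, readonly-user, and s3cret are made-up placeholders.
CONF_DIR=/tmp/docker-client-example
mkdir -p "$CONF_DIR"
# The "auth" value is simply base64("username:password"); nothing is
# fetched from the registry server.
AUTH=$(printf 'readonly-user:s3cret' | base64)
cat > "$CONF_DIR/config.json" <<EOF
{
  "auths": {
    "registry.example.com": {
      "auth": "$AUTH"
    }
  }
}
EOF
# Decoding the stored value recovers the original username:password pair:
grep -o '"auth": "[^"]*"' "$CONF_DIR/config.json" | cut -d'"' -f4 | base64 -d
```

Because the stored value is only an encoding, not a token, anyone who can read the file can recover the password, which is why the file permissions discussed below matter.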
        tangzhankun Zhankun Tang added a comment -

        Yes, agreed. It can store other settings besides credentials. Since the credential won't expire (it is just base64 encoded, not fetched from a server) as long as the username and password don't change, the administrator can store it in advance.

        aw Allen Wittenauer added a comment -

        Why would an admin provide creds and not individual users? Why should there be a global store of credentials? What prevents a user from stealing these global creds?

        tangzhankun Zhankun Tang added a comment -

        The admin-provided creds in config.json are meant to avoid an interactive login during docker pull, I think.

        Indeed, an admin can store several auths for individual repositories. In this case, the application should provide both repo and image name for YARN.

        As for the security of the global creds: since config.json can be owned by root with permissions set to 700, I guess no one else can steal it.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        To be clear, this patch is enabling the ability to set the docker client's configuration directory. It is up to the administrator if they want to store credentials in that configuration. I'm not advocating a global credential store in any regard. Other client configuration is also stored in config.json.

        To answer your question though, one scenario where this is useful is to allow YARN to automatically pull the image from a docker private repository as needed. A read-only user can be created at Docker hub and given read access to the images they require. By storing this read-only credential in config.json, and setting the property provided in this patch, the administrator or end user no longer needs to "pre-pull" the image on all machines, which is an administrative burden. As Zhankun points out, it is common that the config is owned by root and perms set to 700, but again, it is up to the administrator how they want to leverage the docker client config.

        Hope that helps.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Just to add that in the YARN model, we run the docker commands as root and hence the file needs to be owned by root with 400 permissions minimum. YARN is also not aware of the actual username and password, only where the client config is stored. The administrator or user would need to run "docker login" interactively at least once to generate the config.json file and distribute it across the cluster if they wanted to leverage the client config for credentials.

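A minimal sketch of that lockdown follows; a /tmp path is used so the snippet runs without root, whereas in practice the directory would be root-owned (for example somewhere under /etc), and the file contents are placeholders:

```shell
# Sketch: restrict the client config so only its owner can read it.
# In the YARN model the owner would be root; /tmp is used here only so
# the example runs unprivileged.
CONF_DIR=/tmp/docker-client-secured
mkdir -p "$CONF_DIR"
echo '{"auths":{}}' > "$CONF_DIR/config.json"
chmod 400 "$CONF_DIR/config.json"      # owner read-only, no group/other access
stat -c '%a' "$CONF_DIR/config.json"   # prints: 400
```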
        aw Allen Wittenauer added a comment -

        ... and none of that answers the key question:

        Why is the admin providing the passwords and not the user? Why can't a user provide a config.json as part of their job submission? All users having the exact same access to every docker registry seems like a huge fail in any large (read: enterprise) environment.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        They most certainly can. The user would set the docker client config directory at job submission time to point to where their personal config.json file lives.

        This is simply setting a client config directory. How you and your users decide to use or enforce it in your large environment is up to you. This does not force you into storing your credentials in the config. The config file also holds other client settings, such as HTTP proxy settings, which could be quite useful to define cluster wide. There is no default setting enforced by YARN.

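For illustration, an admin-set cluster-wide default might look like the following yarn-site.xml fragment; the property name comes from this patch, and the /etc/docker value is only an example path:

```xml
<!-- Example only: cluster-wide default docker client config directory.
     /etc/docker is a placeholder path. -->
<property>
  <name>yarn.nodemanager.runtime.linux.docker.client-config-directory</name>
  <value>/etc/docker</value>
</property>
```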
        aw Allen Wittenauer added a comment -

        There's a bunch of different settings in play here. Some are more appropriate for the user to provide (credentials) and others are more appropriate for the admin to provide (e.g., proxy). This is especially true in a split network design. It really sounds like there needs to be a merge operation before the config gets sent to the cluster for actual use.

        Where's the design doc for all this work? It really feels like stuff is just getting committed without any long term goal or design in place that has been shared.

        tangzhankun Zhankun Tang added a comment -

        Exposing the admin settings in this config file seems inappropriate. I suppose this patch is mainly a prerequisite (for the credentials in it) for YARN-3854 to pull images from a secured repository without feeding in credentials.

        Allen Wittenauer, if I remember correctly, a single overall config.json can contain all users' credentials for all private repos. So I guess it would be OK for the administrator to copy this file to all nodes? This is why I made it an assumption in the proposal for YARN-3854 currently.

        It seems that this patch already allows the user to pass in a customized config.json; the command "yarn jar app.jar -Dyarn.nodemanager.runtime.linux.docker.client-config-directory=/etc/docker" should work. But the end user's config.json also needs to be distributed to all nodes (otherwise, image localization will fail). This file distribution will mostly be done by the administrator, which becomes the same burden as the single overall config.json mentioned above.

        Since we are implementing all this together, we could also just pass the credentials content to Docker through a newly designed interface; this requires us to have the content materialized in a specific directory's config.json before image localization happens. This needs more discussion.

        Any ideas on passing in credentials content or a single overall config.json, Shane Kumpf, Daniel Templeton?

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Sorry for the delay here.

        Allen Wittenauer, Zhankun Tang - Thank you for the feedback. In stepping back, I agree that there are multiple needs here that require additional consideration. I'm attaching a design document that outlines two approaches to meet needs of both administrators and the user. Please let me know your thoughts.

        tangzhankun Zhankun Tang added a comment -

        Shane Kumpf, thanks for the proposal.

        Just one question about 4.2 (Distribute Application Owner Supplied Docker Client Configuration with the Application): how could we ensure that the config.json localization happens before the Docker image pull in the same localization phase (coordinating with YARN-3854)? I guess Marathon and K8s don't need to consider this because they assume docker pull will be invoked implicitly.

        If I remember correctly, we could theoretically prioritize the docker.tar.gz download, because all LocalResources are downloaded one by one per ContainerLocalizer. Any other ideas on this?

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Thanks for the review, Zhankun Tang! Yes, we will need to coordinate to ensure the File, Archive, and Pattern LocalResourceTypes are processed before the docker image LocalResourceType.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Any other comments on the design doc? If not, I'll get the additional issue opened for the application owner use case.

        aw Allen Wittenauer added a comment -

        I'm a little disturbed by how the architecture is pretty much dedicated to talking about directories rather than configuration objects. I think that's sort of building an architecture for a predetermined solution rather than actually thinking about the problem at hand.

        Let's say that an admin-level solution is picked. The code could copy from HADOOP_CONF_DIR/docker (which would be hard-coded) on each node into the localized directory, then just pass --config to the docker commands. This eliminates the need for a pretty useless configuration entry. It also means that in the future, once it gets fixed to actually do merges, the only code change is going to be the merge.

        Puts the burden on all frameworks that leverage YARN to supply the client configuration.

        I don't think this is necessarily true. If the client configuration is passed as a YARN configuration setting during job submission, it effectively becomes part of the YARN protocol itself just like other YARN configuration settings. Frameworks already have to handle configs.

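One detail worth pinning down here: --config is a global docker CLI option, so it has to appear before the action verb, which is why this patch adds it in the DockerCommand abstract class. A sketch of how the final command line would be assembled (the directory and image name are placeholders):

```shell
# --config is a global docker option: it must precede the action
# (run, pull, load, ...). Path and image name below are placeholders.
CONFIG_DIR=/etc/docker
ACTION=pull
IMAGE=registry.example.com/hadoop-docker:latest
CMD="docker --config=$CONFIG_DIR $ACTION $IMAGE"
echo "$CMD"
# prints: docker --config=/etc/docker pull registry.example.com/hadoop-docker:latest
```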
        shanekumpf@gmail.com Shane Kumpf added a comment -

        Thanks for the feedback, Allen Wittenauer. I agree with the above, but could you help me understand why you believe that hard coding HADOOP_CONF_DIR/docker is more appropriate than allowing the administrator to specify where the docker client config dir lives?

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Any additional comments here?

        aw Allen Wittenauer added a comment -

        Nope. Given that probably no one is going to use YARN as a replacement for k8s or Mesos or whatever, it doesn't really matter.

        tangzhankun Zhankun Tang added a comment -

        Do you mean that the community won't be interested in Docker/YARN in the future?

        aw Allen Wittenauer added a comment -

        It's already not particularly interested. Mesos and Kubernetes have way too much momentum and mind share for anyone to deal with YARN for general Docker management.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Zhankun Tang, there is an enthusiastic community working on the Docker / YARN pieces, we welcome your contributions and feedback. Tx.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Shane Kumpf, Varun Vasudev, Zhankun Tang,

        I read through the attached design doc. My thoughts follow.


        We already have an existing model on how to deal with artifact localization:

        A client (an end-user client or an ApplicationMaster) asks NodeManagers to launch containers

        As part of launching containers, they specify two things in order for the NodeManager to download the requisite artifacts (java API LocalResource):

        1. The list of artifacts - files, jars, tar-balls etc each qualified as a URI
          • The URI dictates to the NodeManager where the artifact is and the schema informs how to get to that path
          • NodeManager today only understands Hadoop FileSystem specific URIs.
        2. In addition to the URI, a bunch of credentials that the user himself/herself passes along to the NodeManager saying "here, use these credentials to talk to the remote storage, representing me, and download the files on the local box. By the way, keep the credentials safe."

        Why can't we extend this mechanism from the API perspective? Kind of your approach (4.2) in the doc.

        1. Extend the URL notion to include docker paths
        2. Pass the login credentials also along similar to our Container credentials today. Obviously, this needs more plumbing to be able to pass along docker login credentials.
        3. Other non-secretive parameters can be passed along as another configuration file in distributed-cache. This obviously creates another need for yet-another storage besides docker, but we already need this to be able to kickstart the AM, spread around more config files etc.

        Your argument about every framework needing to do some work is valid. But almost all of today's frameworks already have user-defined configuration properties to specify additional distributed-cache files without changing code every time.


        And finally, today, NodeManager doesn't have a notion of a default artifacts store. Every app specifies where its files are. We should try to keep this. If not, we should add a first class notion of default artifacts store both for docker and non-docker containers.


          People

          • Assignee:
            shanekumpf@gmail.com Shane Kumpf
            Reporter:
            shanekumpf@gmail.com Shane Kumpf