Hadoop YARN: YARN-2440

Cgroups should allow YARN containers to be limited to allocated cores

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 2.6.0
    • Hadoop Flags:
      Reviewed

      Description

      The current cgroups implementation does not limit YARN containers to the cores allocated in yarn-site.xml.

      Attachments

      1. apache-yarn-2440.0.patch
        7 kB
        Varun Vasudev
      2. apache-yarn-2440.1.patch
        8 kB
        Varun Vasudev
      3. apache-yarn-2440.2.patch
        24 kB
        Varun Vasudev
      4. apache-yarn-2440.3.patch
        22 kB
        Varun Vasudev
      5. apache-yarn-2440.4.patch
        23 kB
        Varun Vasudev
      6. apache-yarn-2440.5.patch
        21 kB
        Varun Vasudev
      7. apache-yarn-2440.6.patch
        22 kB
        Varun Vasudev
      8. screenshot-current-implementation.jpg
        356 kB
        Varun Vasudev

        Activity

        vvasudev Varun Vasudev added a comment -

        Screenshot showing the CPU usage in the current implementation. In my yarn-site.xml, I had set yarn.nodemanager.resource.cpu-vcores to 2. The Python script is taking up as many cores as it can.

        The quota for the yarn group was set to -1.

        varun@ubuntu:/var/hadoop/hadoop-3.0.0-SNAPSHOT$ cat /cgroup/cpu/yarn/cpu.cfs_quota_us
        -1
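
        For illustration only: the repro above used a Python script (not attached); any busy-loop that spawns more threads than the configured vcores shows the same effect. A hypothetical Java equivalent:

            // Hypothetical stand-in for the Python busy-loop in the screenshot:
            // spin one thread per available core; with cpu.cfs_quota_us at -1,
            // the process is free to saturate every core on the box.
            public class BusyLoop {
              public static void main(String[] args) {
                int threads = Runtime.getRuntime().availableProcessors();
                for (int i = 0; i < threads; i++) {
                  new Thread(() -> { while (true) { /* burn CPU */ } }).start();
                }
              }
            }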

        vvasudev Varun Vasudev added a comment -

        Attached patch with fix.

        vvasudev Varun Vasudev added a comment -

        After applying the patch, the quota is set correctly.

        varun@ubuntu:/var/hadoop/hadoop-3.0.0-SNAPSHOT$ cat /cgroup/cpu/yarn/cpu.cfs_quota_us
        200000
        
        nroberts Nathan Roberts added a comment -

        Thanks Varun for the patch. I'm wondering if it would be possible to make this configurable at the system level and per-app. For example, I'd like an application to be able to specify that it wants to run with strict container limits (to verify SLAs, for example), but in general I don't want these limits in place (why not let a container use additional CPU if it's available?).

        vvasudev Varun Vasudev added a comment -

        Nathan Roberts, there's already a ticket for your request - YARN-810. That's next on my todo list. I've left a comment there asking if I can take it over.

        ywskycn Wei Yan added a comment -

        Varun Vasudev, for general cases we shouldn't strictly limit cfs_quota_us. We always want to let co-located containers share the CPU resource proportionally, not strictly follow the container_vcores/NM_vcores ratio. We have one runnable patch in YARN-810. I'll check with Sandy about the review.

        vvasudev Varun Vasudev added a comment -

        Wei Yan, this patch doesn't limit containers to the container_vcores/NM_vcores ratio. What it does do is limit overall YARN usage to yarn.nodemanager.resource.cpu-vcores. If you have 4 cores on a machine and set yarn.nodemanager.resource.cpu-vcores to 2, we currently don't restrict the YARN containers to 2 cores. The containers can create threads and use up as many cores as they want, which defeats the purpose of setting yarn.nodemanager.resource.cpu-vcores.

        ywskycn Wei Yan added a comment -

        Varun Vasudev, I misunderstood this JIRA. Will post a comment later.

        jlowe Jason Lowe added a comment -

        I think cfs_quota_us has a maximum value of 1000000, so we may have an issue if vcores>10.

        I don't see how this takes into account the mapping of vcores to actual CPUs. It's not safe to assume 1 vcore == 1 physical CPU, as some sites will map multiple vcores to a physical core to allow fractions of a physical CPU to be allocated or to account for varying CPU performance across a heterogeneous cluster.
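
        For illustration, a sketch of the clamping this implies, assuming the default cfs_period_us of 100000 and a 1000000 ceiling on cfs_quota_us: with a fixed period, quota = vcores * period exceeds the ceiling once vcores > 10, so the period has to shrink instead.

            // Sketch only (not the patch): derive quota/period for a vcore
            // count under the kernel limits mentioned above.
            class CpuQuota {
              static final int MAX_QUOTA_US = 1000 * 1000;     // ceiling on cfs_quota_us
              static final int DEFAULT_PERIOD_US = 100 * 1000;

              static int[] quotaAndPeriod(int vcores) {
                int period = DEFAULT_PERIOD_US;
                long quota = (long) vcores * period;  // naive form; too big once vcores > 10
                if (quota > MAX_QUOTA_US) {
                  quota = MAX_QUOTA_US;               // pin the quota at the ceiling...
                  period = MAX_QUOTA_US / vcores;     // ...and shrink the period instead
                }
                return new int[] { (int) quota, period };
              }
            }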

        vvasudev Varun Vasudev added a comment -

        Jason Lowe, does it make sense to get the number of physical cores on the machine and derive the vcore-to-physical-CPU ratio?

        vvasudev Varun Vasudev added a comment -

        I'll update the patch to limit cfs_quota_us.

        jlowe Jason Lowe added a comment -

        does it make sense to get the number of physical cores on the machine and derive the vcore to physical cpu ratio?

        Only if the user can specify the multiplier between a vcore and a physical CPU. Not all physical CPUs are created equal, and as I mentioned earlier, some sites will want to allow fractions of a physical CPU to be allocated. Otherwise we're limiting the number of containers to the number of physical cores, and not all tasks require a full core.

        vvasudev Varun Vasudev added a comment -

        There used to be a variable for that ratio but it was removed in YARN-782.

        jlowe Jason Lowe added a comment -

        Interesting. Sandy Ryza could you comment? I'm wondering how we're going to support automatically setting a node's vcore value based on the node's physical characteristics without some kind of property to specify how to convert from physical core to vcore.

        vvasudev Varun Vasudev added a comment -

        Uploaded a new patch to address the issue raised by Jason Lowe on the max value of cfs_quota_us. I'll upload further versions once there's clarity on vcore to physical core mapping.

        sandyr Sandy Ryza added a comment -

        We removed it because it wasn't consistent with the vmem-pmem-ratio and was an unnecessary layer of indirection. If automatically configuring a node's vcore resource based on its physical characteristics is a goal, I wouldn't be opposed to adding something back in.

        For the purposes of this JIRA, might it be simpler to express a config in terms of the % of the node's CPU power that YARN gets?

        vvasudev Varun Vasudev added a comment -

        It might make things easier to go with Sandy Ryza's idea of adding a config which expresses the % of the node's CPU that YARN uses. Jason Lowe, would that address your concerns?

        jlowe Jason Lowe added a comment -

        Sure, for this JIRA we can go with a percent of total CPU to limit YARN. For something like YARN-160 we'd need the user to specify some kind of relationship between vcores and physical cores.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12663704/apache-yarn-2440.1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4694//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4694//console

        This message is automatically generated.

        vvasudev Varun Vasudev added a comment -

        Updated the title to reflect what this patch is doing.

        vvasudev Varun Vasudev added a comment -

        Uploaded a new patch. It adds two new parameters to the YARN configuration.
        1. yarn.nodemanager.containers-cpu-cores - can be set to the number of cores admins want YARN containers to use.
        2. yarn.nodemanager.containers-cpu-percentage - can be set to a percentage of the overall CPU that admins want YARN containers to use.

        By default, we use all cores.

        I've also updated the description of yarn.nodemanager.resource.cpu-vcores to what I think is a more accurate version.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12664090/apache-yarn-2440.2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4714//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4714//console

        This message is automatically generated.

        jlowe Jason Lowe added a comment -

        Thanks for updating the patch, Varun.

        I don't see why we need both a containers-cpu-cores and containers-cpu-percentage, and I think it leads to confusion when both exist. At first I did not realize that one overrode the other. Instead I assumed that if you set cpu-cores to X and cpu-percentage to Y then you were requesting Y% of X cores. Then there's the additional question of whether container usage is pinned to those cores, etc. Only having cpu-percentage is a simpler model that still allows the user to specify cores indirectly (e.g.: 25% of an 8 core system is 2 cores). Maybe I'm missing the use case where we really need containers-cpu-cores and the confusing (to me at least) override behavior between the two properties.

        Other comments on the patch:

        • I'm not thrilled about the name template "containers-cpu-*" since it could easily be misinterpreted as a per-container thing as well, but I'm currently at a loss for a better prefix. Suggestions welcome.
        • Does getOverallLimits need to check for a quotaUS that's too low as well?
        • I think minimally we need to log a warning if we're going to ignore setting up cgroups to limit CPU usage across all containers if the user specified to do so.
        • Related to the previous comment, I think it would be nice if we didn't try to setup any limits if none were specified. That way if there's some issue with correctly determining the number of cores on a particular system it can still work in the default, "use everything" scenario.
        • NodeManagerHardwareUtils.getContainerCores should be getContainersCores (the per-container vs. all-containers confusion again)
        sjlee0 Sangjin Lee added a comment -

        It might be good to use several fairly representative scenarios and see how we can satisfy them with clear configuration.

        One scenario I expect to be pretty common is this (just for illustration):

        • 8-core system
        • want to use only 6 cores for containers (reserving 2 for NM and DN, etc.)
        • want to allocate 1/2 core per container by default

        IMO, the simplest config is

        yarn.nodemanager.resource.cpu-vcores = 60
        yarn.nodemanager.containers-cores-to-vcores = 10
        each container asks 5 vcores

        Or I could have

        yarn.nodemanager.resource.cpu-vcores = 60
        yarn.nodemanager.containers-cpu-cores = 6 (core-to-vcore ratio understood as the ratio of these two)
        each container asks 5 vcores

        I'm not sure how I can use containers-cpu-percentage to describe this scenario...

        Does this help? Are there other types of use cases that we should review this with?

        beckham007 Beckham007 added a comment -

        Hi all,
        why not use the cpuset subsystem of cgroups?
        cpuset can pin containers to their allocated cores while reserving some cores for the system.

        vvasudev Varun Vasudev added a comment -

        Jason Lowe, the example provided by Sangjin Lee is the one I wanted to address when I added support for both percentage and absolute cores. Would it make more sense if I picked the lower value instead of one overriding the other? Something like:

        1. Evaluate cores allocated by yarn.nodemanager.containers-cpu-cores and yarn.nodemanager.containers-cpu-percentage.
        2. Pick the lower of the two values
        3. Log a warning/info message that both were specified and that we're picking the lower value.

        I'm not thrilled about the name template "containers-cpu-*" since it could easily be misinterpreted as a per-container thing as well, but I'm currently at a loss for a better prefix. Suggestions welcome.

        How about yarn.nodemanager.all-containers-cpu-cores and yarn.nodemanager.all-containers-cpu-percentage?

        Does getOverallLimits need to check for a quotaUS that's too low as well?

        Thanks for catching this; I'll fix it in the next patch.

        I think minimally we need to log a warning if we're going to ignore setting up cgroups to limit CPU usage across all containers if the user specified to do so.

        I'll add in the logging message.

        Related to the previous comment, I think it would be nice if we didn't try to setup any limits if none were specified. That way if there's some issue with correctly determining the number of cores on a particular system it can still work in the default, "use everything" scenario.

        Will do.

        NodeManagerHardwareUtils.getContainerCores should be getContainersCores (the per-container vs. all-containers confusion again)

        I'll rename the function.

        vvasudev Varun Vasudev added a comment -

        Beckham007, the current implementation of cgroups uses the cpu subsystem instead of cpuset, probably due to the flexibility offered (sharing of cores is handled by the kernel). Is there any particular benefit to cpuset?

        beckham007 Beckham007 added a comment -

        Hi Varun Vasudev, “Cpusets provide a mechanism for assigning a set of CPUs and Memory Nodes to a set of tasks.” https://www.kernel.org/doc/Documentation/cgroups/cpusets.txt
        For an NM with 24 pcores, we could use the cpuset subsystem to make hadoop-yarn use cores 0-21 and leave the others (22, 23) for the system, then use cpu.shares to share pcores 0-21. What's more, we could assign a pcore (such as core 21) to run a long-running container, with the other containers sharing only pcores 0-20.
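
        For illustration, a minimal sketch of the cpuset idea; the mount point and group name are assumptions, and the YARN handler of the time used the cpu controller, not cpuset:

            import java.io.IOException;
            import java.nio.file.Files;
            import java.nio.file.Paths;

            // Illustration of the suggestion above, not YARN's implementation.
            class CpusetExample {
              public static void main(String[] args) throws IOException {
                String yarnCpuset = "/cgroup/cpuset/hadoop-yarn";  // assumed path
                // Pin all YARN containers to cores 0-21, leaving 22-23 for the system.
                Files.write(Paths.get(yarnCpuset, "cpuset.cpus"), "0-21".getBytes());
                // cpuset also needs a memory-node assignment before tasks can join.
                Files.write(Paths.get(yarnCpuset, "cpuset.mems"), "0".getBytes());
              }
            }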

        beckham007 Beckham007 added a comment -

        In addition, Mesos uses cpuset by default. https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp

        jlowe Jason Lowe added a comment -

        For the case presented by Sangjin Lee the user has an 8 core system and wants to use at most 6 cores for YARN containers. That can be done by simply setting containers-cpu-percentage to 75. I don't see why we need a separate containers-cpu-cores parameter here, and I think it causes more problems than it solves per my previous comment. If we only want to support whole-core granularity then I can see containers-cpu-cores as a better choice, but otherwise containers-cpu-percentage is more flexible.

        Also I don't see vcores being relevant for this JIRA. The way vcores map to physical cores is node-dependent, but apps ask for vcores in a node-independent fashion. IIUC this JIRA is focused on simply limiting the amount of CPU all YARN containers on the node can possibly use in aggregate. Changing the vcore-to-core ratio on the node will change how many containers the node might run simultaneously, but it shouldn't impact how much of the physical CPU the user wants reserved for non-container processes.

        On a related note, it's interesting to step back and see if this is really what most users will want in practice. If the intent is to ensure the NM, DN, and other system processes get enough CPU time then I think a better approach is to put those system processes in a peer cgroup to the YARN containers cgroup and set their relative CPU shares accordingly. Then YARN containers can continue to use any spare CPU if desired (i.e.: no CPU "fragmentation") but the system processes are guaranteed not to be starved out by the YARN containers. Some users may want a hard limit and hence why this feature would be useful for them, but I suspect most users will not want to leave spare CPU lying around when containers need it.

        How about yarn.nodemanager.all-containers-cpu-cores and yarn.nodemanager.all-containers-cpu-percentage?

        I'm indifferent on adding "all" as a prefix. Something like yarn.nodemanager.containers-limit-cpu-percentage might be more clear that this is a hard limit and CPUs can go idle even if containers are demanding more from the machine than this limit.
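
        For illustration, a minimal sketch of the peer-cgroup alternative described above; the group names and share values are assumptions, not anything YARN configures:

            import java.io.IOException;
            import java.nio.file.Files;
            import java.nio.file.Paths;

            // Give system daemons and YARN containers sibling cgroups with
            // relative cpu.shares: containers may still use idle CPU, but they
            // cannot starve the daemons.
            class PeerSharesExample {
              public static void main(String[] args) throws IOException {
                // Twice the weight for system daemons vs. the default of 1024.
                Files.write(Paths.get("/cgroup/cpu/system/cpu.shares"), "2048".getBytes());
                Files.write(Paths.get("/cgroup/cpu/hadoop-yarn/cpu.shares"), "1024".getBytes());
              }
            }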

        vvasudev Varun Vasudev added a comment -

        Uploaded new patch which addresses Jason Lowe's comments.
        1. Dropped yarn.nodemanager.containers-cpu-cores; only a percentage can be specified.
        2. Renamed yarn.nodemanager.containers-cpu-percentage to yarn.nodemanager.containers-limit-cpu-percentage
        3. Added a log message if we are unable to use cgroups because the values for period or quota are too low.
        4. Added a check for the minimum value of quota.
        5. Don't set any period or quota settings if all CPUs are being used.
        6. Renamed getContainerCores to getContainersCores.
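
        For illustration, a sketch of how a percentage limit can translate into cfs settings, under the same assumptions as the earlier sketch; the actual logic lives in the patch's CgroupsLCEResourcesHandler changes:

            // Sketch, not the patch's exact code. yarnProcessors is
            // numPhysicalCores * percentageLimit / 100, e.g. 8 cores at 75% = 6.0.
            class OverallLimits {
              static final int MAX_QUOTA_US = 1000 * 1000;
              static final int MIN_PERIOD_US = 1000;  // assumed lower bound on cfs_period_us

              static int[] overallLimits(float yarnProcessors) {
                int quota = MAX_QUOTA_US;
                int period = (int) (MAX_QUOTA_US / yarnProcessors);
                if (yarnProcessors < 1.0f) {
                  // Less than one core: keep the longest period, scale the quota down.
                  period = MAX_QUOTA_US;
                  quota = (int) (period * yarnProcessors);
                }
                if (period < MIN_PERIOD_US) {
                  // Cannot express this many cores under the quota ceiling; log and
                  // fall back to no limit, as change 3 above describes.
                  return new int[] { -1, -1 };
                }
                return new int[] { quota, period };
              }
            }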

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12664596/apache-yarn-2440.3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4739//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4739//console

        This message is automatically generated.

        vvasudev Varun Vasudev added a comment -

        Uploaded a new patch that fixes a missed use case where usage was reset to 100% after being lowered.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12664653/apache-yarn-2440.4.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4742//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4742//console

        This message is automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Just caught up with the discussion. I can get behind an absolute limit too, specifically in the context of heterogeneous clusters, where uniform % configurations can go really wrong and the only resort is then per-node configuration - not ideal. Would that be a valid use-case for putting in the absolute limit, Jason Lowe? Even if it were, I am okay punting that off to a separate JIRA.

        Comments on the patch:

        • containers-limit-cpu-percentage -> yarn.nodemanager.resource.percentage-cpu-limit to be consistent? Similarly NM_CONTAINERS_CPU_PERC? I don't like the tag 'resource', it should have been 'resources' but it is what it is.
        • You still have refs to YarnConfiguration.NM_CONTAINERS_CPU_ABSOLUTE in the patch. Similarly the javadoc in NodeManagerHardwareUtils needs to be updated if we are not adding the absolute cpu config. It should no longer refer to "number of cores that should be used for YARN containers"
        • TestCgroupsLCEResourcesHandler: You can use mockito if you only want to override num-processors in TestResourceCalculatorPlugin. Similarly in TestNodeManagerHardwareUtils.
        • The tests may fail on a machine with > 4 cores?
        jlowe Jason Lowe added a comment -

        Specifically in the context of heterogeneous clusters where uniform % configurations can go really bad where the only resort will then be to do per-node configuration - not ideal.

        Yes, I could see the heterogeneous cluster being a case where specifying absolute instead of relative may be desirable. My biggest concern is that it's confusing when trying to combine the absolute and relative concepts - it's not obvious if one overrides the other or if one is relative to the other.

        Part of my concern to keep this as simple as possible and the configuration burden to an absolute minimum is that I'm missing the real-world use case. As I mentioned before, I think most users would rather not use the functionality proposed by this JIRA but instead setup peer cgroups for other systems and set their relative cgroup shares appropriately. With this JIRA the CPUs could sit idle despite demand from YARN containers, while a peer cgroup setup allows CPU guarantees without idle CPUs if the demand is there.

        vvasudev Varun Vasudev added a comment -

        Uploaded new patch to address Vinod's concerns.

        containers-limit-cpu-percentage -> yarn.nodemanager.resource.percentage-cpu-limit to be consistent? Similarly NM_CONTAINERS_CPU_PERC? I don't like the tag 'resource', it should have been 'resources' but it is what it is.

        I'm worried that calling it that will lead users to think it's a percentage of the vcores that they specify. In the patch I've changed it to yarn.nodemanager.resource.percentage-physical-cpu-limit but if you or Jason feel strongly about it, I can change it to yarn.nodemanager.resource.percentage-cpu-limit.

        You still have refs to YarnConfiguration.NM_CONTAINERS_CPU_ABSOLUTE in the patch. Similarly the javadoc in NodeManagerHardwareUtils needs to be updated if we are not adding the absolute cpu config. It should no longer refer to "number of cores that should be used for YARN containers"

        Fixed.

        TestCgroupsLCEResourcesHandler: You can use mockito if you only want to override num-processors in TestResourceCalculatorPlugin. Similarly in TestNodeManagerHardwareUtils.

        Switched to mockito.

        The tests may fail on a machine with > 4 cores?

        Don't think so. The tests mock the getNumProcessors() function so we should be fine.
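
        For illustration, roughly how such a stub looks with Mockito; the helper shown here is an assumption, not the test's actual code:

            import static org.mockito.Mockito.mock;
            import static org.mockito.Mockito.when;

            import org.apache.hadoop.yarn.util.ResourceCalculatorPlugin;

            // Stub the processor count so the test does not depend on the
            // machine running it (e.g. a box with more than 4 cores).
            class NumProcessorsStub {
              static ResourceCalculatorPlugin fourCorePlugin() {
                ResourceCalculatorPlugin plugin = mock(ResourceCalculatorPlugin.class);
                when(plugin.getNumProcessors()).thenReturn(4);
                return plugin;
              }
            }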

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12667461/apache-yarn-2440.5.patch
        against trunk revision 2749fc6.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4860//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4860//console

        This message is automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        This looks close. Some more comments:

           <property>
          +    <description>Percentage of CPU that can be allocated
          +    for containers. This setting allows users to limit the number of
          +    physical cores that YARN containers use. Currently functional only
          +    on Linux using cgroups. The default is to use 100% of CPU.
          +    </description>
          +    <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
          +    <value>100</value>
          +  </property>
        

        "the number of physical cores" part isn't really right. It actually is 75% across all cores, for e.g. We have this sort of "number of physical cores" description in multiple places, let's fix that? For instance, in NodeManagerHardwareUtils, yarn-default.xml etc.

        Also,

        • NM_CONTAINERS_CPU_PERC -> NM_RESOURCE_PHYSICAL_CPU_LIMIT
        • Similarly rename DEFAULT_NM_CONTAINERS_CPU_PERC
        vvasudev Varun Vasudev added a comment -

        Uploaded new patch to address Vinod's comments.

           <property>
          +    <description>Percentage of CPU that can be allocated
          +    for containers. This setting allows users to limit the number of
          +    physical cores that YARN containers use. Currently functional only
          +    on Linux using cgroups. The default is to use 100% of CPU.
          +    </description>
          +    <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
          +    <value>100</value>
          +  </property>
        

        "the number of physical cores" part isn't really right. It actually is 75% across all cores, for e.g. We have this sort of "number of physical cores" description in multiple places, let's fix that? For instance, in NodeManagerHardwareUtils, yarn-default.xml etc.

        Fixed.

        Also,
        NM_CONTAINERS_CPU_PERC -> NM_RESOURCE_PHYSICAL_CPU_LIMIT
        Similarly rename DEFAULT_NM_CONTAINERS_CPU_PERC

        Done, I'd prefer to have percentage as part of the name. I've changed it to NM_RESOURCE_PERCENTAGE_PHYSICAL_CPU_LIMIT and DEFAULT_NM_RESOURCE_PERCENTAGE_PHYSICAL_CPU_LIMIT.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12667797/apache-yarn-2440.6.patch
        against trunk revision b67d5ba.

        -1 patch. Trunk compilation may be broken.

        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4871//console

        This message is automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment (edited) -

        As I mentioned before, I think most users would rather not use the functionality proposed by this JIRA but instead setup peer cgroups for other systems and set their relative cgroup shares appropriately. With this JIRA the CPUs could sit idle despite demand from YARN containers, while a peer cgroup setup allows CPU guarantees without idle CPUs if the demand is there.

        Jason Lowe, agree with the general philosophy. Though we are not yet there in practice - datanodes, region servers don't yet live in cgroups in many sites. Looking back at this JIRA, I see a good use for this. Having the overall YARN limit will help ensure that apps' containers don't thrash cpu once we start enabling cgroups support.

        The other dimension to this is determinism w.r.t performance. Limiting to allocated cores overall (as well as per container later) helps orgs run workloads and reason about them deterministically. One of the examples is benchmarking apps, but deterministic execution is a desired option beyond benchmarks too.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        The build machine ran into an issue which Giridharan Kesavan helped fix at my offline request. Re-kicked the build manually.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12667797/apache-yarn-2440.6.patch
        against trunk revision cbfe263.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4875//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4875//console

        This message is automatically generated.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        This looks good, +1. Checking this in.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Committed this to trunk and branch-2. Thanks Varun!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #677 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/677/)
        YARN-2440. Enabled Nodemanagers to limit the aggregate cpu usage across all containers to a preconfigured limit. Contributed by Varun Vasudev. (vinodkv: rev 4be95175cdb58ff12a27ab443d609d3b46da7bfa)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #1893 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1893/)
        YARN-2440. Enabled Nodemanagers to limit the aggregate cpu usage across all containers to a preconfigured limit. Contributed by Varun Vasudev. (vinodkv: rev 4be95175cdb58ff12a27ab443d609d3b46da7bfa)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Hdfs-trunk #1868 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1868/)
        YARN-2440. Enabled Nodemanagers to limit the aggregate cpu usage across all containers to a preconfigured limit. Contributed by Varun Vasudev. (vinodkv: rev 4be95175cdb58ff12a27ab443d609d3b46da7bfa)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java

          People

          • Assignee:
            vvasudev Varun Vasudev
          • Reporter:
            vvasudev Varun Vasudev
          • Votes:
            0
          • Watchers:
            18
