  1. Hadoop YARN
  2. YARN-5936

When CPU strict mode is disabled, YARN cannot ensure scheduling fairness between containers

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels:
      None
    • Environment:

      CentOS7.1

    • Target Version/s:

      Description

      When using LinuxContainerExecutor, setting "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to true ensures scheduling fairness through cgroup CPU bandwidth control. In our experience, however, cgroup CPU bandwidth control leads to poor performance.
      Without cgroup CPU bandwidth control, the cgroup cpu.shares value is our only way to ensure scheduling fairness, but it is not completely effective. For example, take two containers with the same number of vcores (and therefore the same cpu.shares), where one container is single-threaded and the other is multi-threaded. The multi-threaded container gets more CPU time, which is unreasonable.
      Here is my test case. I submit two distributedshell applications with the following two commands:

      hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_script ./run.sh  -shell_args 10 -num_containers 1 -container_memory 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
      
      hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_script ./run.sh  -shell_args 1  -num_containers 1 -container_memory 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
      

      Here is the CPU time of the two containers, as reported by top:

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
      15448 yarn      20   0 9059592  28336   9180 S 998.7  0.1  24:09.30 java
      15026 yarn      20   0 9050340  27480   9188 S 100.0  0.1   3:33.97 java
      13767 yarn      20   0 1799816 381208  18528 S   4.6  1.2   0:30.55 java
         77 root      rt   0       0      0      0 S   0.3  0.0   0:00.74 migration/1   
      

      We find that the CPU usage of the multi-threaded container is about ten times that of the single-threaded container, even though the two containers have the same cpu.shares.
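One possible way to read these numbers is that cpu.shares only takes effect when cores are contended: spare capacity goes to whichever container has runnable threads. The sketch below is a deliberately simplified model of that behaviour, not the kernel's actual scheduler and not YARN code. The class name ShareModel, the allocate method, and the 16-core and 2-core node sizes are all hypothetical; the report does not state how many cores the test machine has.

```java
import java.util.Arrays;

// Simplified, hypothetical model of share-based CPU allocation.
// It is NOT the Linux CFS algorithm; it only illustrates the idea that
// shares are weights applied to contended capacity.
class ShareModel {

    /**
     * Splits `cores` among containers in proportion to their shares, but
     * never gives a container more cores than it has runnable threads.
     * Capacity a container cannot use is offered to the remaining ones.
     */
    static double[] allocate(double cores, int[] shares, int[] threads) {
        int n = shares.length;
        double[] alloc = new double[n];
        boolean[] capped = new boolean[n];
        double remaining = cores;

        while (remaining > 1e-9) {
            long activeShares = 0;
            for (int i = 0; i < n; i++) {
                if (!capped[i]) activeShares += shares[i];
            }
            if (activeShares == 0) break; // every container is fully satisfied

            boolean anyCapped = false;
            double handedOut = 0;
            for (int i = 0; i < n; i++) {
                if (capped[i]) continue;
                double offer = remaining * shares[i] / activeShares;
                double need = threads[i] - alloc[i];
                if (offer >= need) {
                    // The container cannot use its whole proportional offer.
                    alloc[i] = threads[i];
                    capped[i] = true;
                    handedOut += need;
                    anyCapped = true;
                }
            }

            if (!anyCapped) {
                // Everyone left wants more than its offer: split by weight.
                for (int i = 0; i < n; i++) {
                    if (!capped[i]) alloc[i] += remaining * shares[i] / activeShares;
                }
                remaining = 0;
            } else {
                remaining -= handedOut;
            }
        }
        return alloc;
    }

    public static void main(String[] args) {
        int[] shares = {1024, 1024}; // equal weights, as in the report
        int[] threads = {10, 1};     // multi-threaded vs single-threaded container

        // Assumed 16-core node: no contention, so the split follows demand.
        System.out.println(Arrays.toString(allocate(16, shares, threads))); // [10.0, 1.0]

        // Assumed 2-core node: cores are contended, so equal shares apply.
        System.out.println(Arrays.toString(allocate(2, shares, threads)));  // [1.0, 1.0]
    }
}
```

Under this model the 10:1 split seen in the top output appears whenever the node has idle cores, and equal shares only even things out once the containers compete for the same cores.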

      notes:
      run.sh

       
      	java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1	
      

      loop.java

       
      package loop;
      public class loop {
      	public static void main(String[] args) {
      		// Number of busy threads to start; defaults to 1 when no argument is given.
      		int loop = 1;
      		if(args.length>=1) {
      			System.out.println(args[0]);
      			loop = Integer.parseInt(args[0]);
      		}
      		for(int i=0;i<loop;i++){
      			System.out.println("start thread " + i);
      			new Thread(new Runnable() {
      				@Override
      				public void run() {
      					// Spin forever so that each thread keeps one core busy.
      					int j=0;
      					while(true){j++;}
      				}
      			}).start();
      		}
      	}
      }
      

              People

              • Assignee:
                Unassigned
                Reporter:
                zhengchenyu
              • Votes:
                0
                Watchers:
                14

                  Time Tracking

                  • Estimated: 1m
                  • Remaining: 1m
                  • Logged: Not Specified