Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: fairscheduler
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A number of issues have been reported with respect to preemption in FairScheduler along the lines of:

      1. FairScheduler preempts resources from nodes even if the resulting free resources cannot fit the incoming request.
      2. Preemption doesn't take resources from sibling queues.
      3. Preemption doesn't take resources from sibling apps under the same queue when that queue is over its fair share.
      4. ...

      Filing this umbrella JIRA to group all the issues together and think of a comprehensive solution.
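
      As an illustration of item 1, here is a minimal sketch (not the FairScheduler implementation) of checking, before preempting anything on a node, that the freed resources would actually fit the incoming request. The class PreemptionFitSketch, the simplified Res type, and the method containersToPreempt are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified stand-ins; these are not the YARN classes. */
final class PreemptionFitSketch {

  /** Simplified resource vector: memory in MB and virtual cores. */
  static final class Res {
    final long memoryMb;
    final int vcores;

    Res(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }

    Res plus(Res other) {
      return new Res(memoryMb + other.memoryMb, vcores + other.vcores);
    }

    /** True if a request of the given size fits inside this resource. */
    boolean canFit(Res request) {
      return request.memoryMb <= memoryMb && request.vcores <= vcores;
    }
  }

  /**
   * Returns the containers to preempt on one node so that the request fits.
   * Returns an empty list if no preemption is needed, or if preempting every
   * candidate would still not make room for the request.
   */
  static List<Res> containersToPreempt(
      Res nodeFree, List<Res> preemptable, Res request) {
    List<Res> chosen = new ArrayList<>();
    Res freed = nodeFree;
    for (Res container : preemptable) {
      if (freed.canFit(request)) {
        break; // already enough room, stop selecting containers
      }
      chosen.add(container);
      freed = freed.plus(container);
    }
    // Preempt nothing unless the request actually fits afterwards.
    return freed.canFit(request) ? chosen : Collections.<Res>emptyList();
  }

  public static void main(String[] args) {
    Res request = new Res(4096, 2);

    // 1 GB free plus one 1 GB container is 2 GB in total: a 4 GB request
    // cannot fit, so nothing is selected for preemption.
    List<Res> tooSmall = new ArrayList<>();
    tooSmall.add(new Res(1024, 1));
    System.out.println(
        containersToPreempt(new Res(1024, 1), tooSmall, request).size()); // prints 0

    // 1 GB free plus two 2 GB containers is 5 GB: two containers suffice.
    List<Res> enough = new ArrayList<>();
    enough.add(new Res(2048, 1));
    enough.add(new Res(2048, 1));
    enough.add(new Res(2048, 1));
    System.out.println(
        containersToPreempt(new Res(1024, 1), enough, request).size()); // prints 2
  }
}
```

      The sketch takes candidates in the order given and ignores everything else a real scheduler would weigh (queue fair shares, container priority, AM containers); it only shows the "preempt only if the request then fits" check.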

      Attachments

        1. yarn-6076-branch-2.1.patch
          182 kB
          Karthik Kambatla
        2. yarn-4752-1.patch
          164 kB
          Karthik Kambatla
        3. YARN-4752.FairSchedulerPreemptionOverhaul.pdf
          59 kB
          Karthik Kambatla
        4. yarn-4752.4.patch
          181 kB
          Karthik Kambatla
        5. yarn-4752.4.patch
          181 kB
          Karthik Kambatla
        6. yarn-4752.3.patch
          182 kB
          Karthik Kambatla
        7. yarn-4752.2.patch
          167 kB
          Karthik Kambatla

        Activity

          kasha Karthik Kambatla added a comment -

          YARN-6215 adds a lock and, on the surface, it appears that it would tackle this too. In any case, should we discuss this on a new JIRA so we can address any follow-up work there?
          zsl2007 zhangshilong added a comment -

          Karthik Kambatla, I found one problem.
          In FSLeafQueue, I think an app's resource usage should not change during assignContainer, because FairShareComparator uses the resource usage to sort apps:

          private TreeSet<FSAppAttempt> fetchAppsWithDemand() {
            TreeSet<FSAppAttempt> pendingForResourceApps =
                new TreeSet<>(policy.getComparator());
            readLock.lock();
            try {
              for (FSAppAttempt app : runnableApps) {
                Resource pending = app.getAppAttemptResourceUsage().getPending();
                if (!pending.equals(none())) {
                  pendingForResourceApps.add(app);
                }
              }
            } finally {
              readLock.unlock();
            }
            return pendingForResourceApps;
          }
          

          However, in FSPreemptionThread, the call chain run -> preemptContainers -> app.trackContainerForPreemption changes the app's preemptedResources without holding the FairScheduler lock.
          As a result, the value returned by the app's getResourceUsage can change while assignContainer in FSLeafQueue is running:

          @Override
          public Resource getResourceUsage() {
            /*
             * getResourcesToPreempt() returns zero, except when there are containers
             * to preempt. Avoid creating an object in the common case.
             */
            return getPreemptedResources().equals(Resources.none())
                ? getCurrentConsumption()
                : Resources.subtract(getCurrentConsumption(), getPreemptedResources());
          }
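
          The hazard described above can be reproduced outside YARN. The following is a minimal sketch, assuming a simplified App class whose usage is consumption minus preempted resources; MutableSortKeySketch, App, BY_USAGE, and sortByUsage are hypothetical names, not YARN APIs. It shows that a TreeSet does not re-sort when a field read by its comparator changes after insertion, so the set keeps its old order until it is rebuilt.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical, simplified stand-ins; these are not the YARN classes. */
final class MutableSortKeySketch {

  static final class App {
    final String name;
    final int consumption;
    // In the scenario above, another thread would update this value.
    int preempted;

    App(String name, int consumption) {
      this.name = name;
      this.consumption = consumption;
    }

    /** Usage as seen by the comparator: consumption minus preempted. */
    int getResourceUsage() {
      return consumption - preempted;
    }
  }

  /** Orders apps by current usage, breaking ties by name. */
  static final Comparator<App> BY_USAGE = (a, b) -> {
    int byUsage = Integer.compare(a.getResourceUsage(), b.getResourceUsage());
    return byUsage != 0 ? byUsage : a.name.compareTo(b.name);
  };

  /** Builds a fresh sorted set from a plain list of apps. */
  static TreeSet<App> sortByUsage(List<App> apps) {
    TreeSet<App> sorted = new TreeSet<>(BY_USAGE);
    sorted.addAll(apps);
    return sorted;
  }

  public static void main(String[] args) {
    List<App> apps = new ArrayList<>();
    App a = new App("A", 10);
    App b = new App("B", 20);
    App c = new App("C", 30);
    apps.add(a);
    apps.add(b);
    apps.add(c);

    TreeSet<App> sorted = sortByUsage(apps);

    // Simulate preemption being tracked for C after the set was built:
    // C's usage drops from 30 to 5, below A's usage of 10.
    c.preempted = 25;

    // The existing set still iterates in the order fixed at insertion time.
    System.out.println(sorted.first().name);            // prints "A"
    // A set rebuilt from the list sees the new usage.
    System.out.println(sortByUsage(apps).first().name); // prints "C"
  }
}
```

          This sketch is single-threaded on purpose; it only isolates the stale-ordering effect. With a second thread changing the value, the same effect can occur at any point while the sorted set is in use, unless both sides hold the same lock.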
          

          kasha Karthik Kambatla added a comment -

          Attaching the branch-2 patch that was committed; the Jenkins run is on YARN-6076.

          kasha Karthik Kambatla added a comment -

          I have moved all pending subtasks to YARN-5990 and am closing this for better release notes. Thanks for the nudge, Andrew Wang.
          andrew.wang Andrew Wang added a comment -

          Daniel Templeton Looks like this umbrella was committed to trunk, should we resolve / set the fix version?

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10885 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10885/)
          YARN-4752. Improved preemption in FairScheduler. (kasha) (kasha: rev 10468529a9b858bd945e7ecb063c9c1438efa474)

          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueueManagerRealScheduler.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSStarvedApps.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerWithMockPreemption.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestSchedulingPolicy.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/Resources.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/Schedulable.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSContext.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSPreemptionThread.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSAppStarvation.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FakeSchedulable.java
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 8 new or modified test files.
          0 mvndep 0m 9s Maven dependency ordering for branch
          +1 mvninstall 6m 47s trunk passed
          +1 compile 4m 57s trunk passed
          +1 checkstyle 0m 48s trunk passed
          +1 mvnsite 1m 20s trunk passed
          +1 mvneclipse 0m 40s trunk passed
          +1 findbugs 2m 4s trunk passed
          +1 javadoc 1m 1s trunk passed
          0 mvndep 0m 10s Maven dependency ordering for patch
          +1 mvninstall 0m 58s the patch passed
          +1 compile 4m 37s the patch passed
          +1 javac 4m 37s the patch passed
          -0 checkstyle 0m 45s hadoop-yarn-project/hadoop-yarn: The patch generated 13 new + 173 unchanged - 145 fixed = 186 total (was 318)
          +1 mvnsite 1m 20s the patch passed
          +1 mvneclipse 0m 39s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 20s the patch passed
          +1 javadoc 0m 33s hadoop-yarn-common in the patch passed.
          +1 javadoc 0m 25s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 926 unchanged - 9 fixed = 926 total (was 935)
          +1 unit 2m 23s hadoop-yarn-common in the patch passed.
          +1 unit 43m 32s hadoop-yarn-server-resourcemanager in the patch passed.
          +1 asflicense 0m 29s The patch does not generate ASF License warnings.
          84m 28s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue YARN-4752
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839312/yarn-4752.4.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a6a540ff52a7 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / aab9737
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13954/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13954/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13954/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 docker 4m 14s Docker failed to build yetus/hadoop:a9ad5d6.



          Subsystem Report/Notes
          JIRA Issue YARN-4752
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839290/yarn-4752.4.patch
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13952/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          kasha Karthik Kambatla added a comment -

          Updated cumulative patch after YARN-5885 got committed. This patch (v4) is the first candidate for merge to trunk.
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 8 new or modified test files.
          0 mvndep 0m 10s Maven dependency ordering for branch
          +1 mvninstall 6m 42s trunk passed
          +1 compile 5m 4s trunk passed
          +1 checkstyle 0m 48s trunk passed
          +1 mvnsite 1m 20s trunk passed
          +1 mvneclipse 0m 41s trunk passed
          +1 findbugs 2m 6s trunk passed
          +1 javadoc 1m 1s trunk passed
          0 mvndep 0m 10s Maven dependency ordering for patch
          +1 mvninstall 1m 0s the patch passed
          +1 compile 4m 40s the patch passed
          +1 javac 4m 40s the patch passed
          -0 checkstyle 0m 45s hadoop-yarn-project/hadoop-yarn: The patch generated 15 new + 180 unchanged - 140 fixed = 195 total (was 320)
          +1 mvnsite 1m 17s the patch passed
          +1 mvneclipse 0m 39s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 20s the patch passed
          +1 javadoc 0m 32s hadoop-yarn-common in the patch passed.
          +1 javadoc 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 926 unchanged - 9 fixed = 926 total (was 935)
          +1 unit 2m 23s hadoop-yarn-common in the patch passed.
          -1 unit 38m 2s hadoop-yarn-server-resourcemanager in the patch failed.
          +1 asflicense 0m 29s The patch does not generate ASF License warnings.
          79m 4s



          Reason Tests
          Failed junit tests hadoop.yarn.server.resourcemanager.TestTokenClientRMService



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue YARN-4752
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839013/yarn-4752.3.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux ec9116424412 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 3219b7b
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13925/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13925/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13925/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13925/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 7 new or modified test files.
          0 mvndep 0m 51s Maven dependency ordering for branch
          +1 mvninstall 6m 44s trunk passed
          -1 compile 5m 19s hadoop-yarn in trunk failed.
          +1 checkstyle 0m 52s trunk passed
          +1 mvnsite 1m 27s trunk passed
          +1 mvneclipse 0m 49s trunk passed
          +1 findbugs 2m 13s trunk passed
          +1 javadoc 1m 7s trunk passed
          0 mvndep 0m 10s Maven dependency ordering for patch
          +1 mvninstall 1m 0s the patch passed
          -1 compile 3m 53s hadoop-yarn in the patch failed.
          -1 javac 3m 53s hadoop-yarn in the patch failed.
          -0 checkstyle 0m 49s hadoop-yarn-project/hadoop-yarn: The patch generated 11 new + 183 unchanged - 137 fixed = 194 total (was 320)
          +1 mvnsite 1m 25s the patch passed
          +1 mvneclipse 0m 47s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 28s the patch passed
          +1 javadoc 0m 36s hadoop-yarn-common in the patch passed.
          +1 javadoc 0m 30s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 916 unchanged - 9 fixed = 916 total (was 925)
          +1 unit 2m 27s hadoop-yarn-common in the patch passed.
          +1 unit 37m 18s hadoop-yarn-server-resourcemanager in the patch passed.
          +1 asflicense 0m 37s The patch does not generate ASF License warnings.
          80m 10s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:e809691
          JIRA Issue YARN-4752
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12838473/yarn-4752.2.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 4d71dc1e74cc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 93eeb13
          Default Java 1.8.0_101
          compile https://builds.apache.org/job/PreCommit-YARN-Build/13866/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn.txt
          findbugs v3.0.0
          compile https://builds.apache.org/job/PreCommit-YARN-Build/13866/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13866/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13866/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13866/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13866/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.


          kasha Karthik Kambatla added a comment -

          Attaching the preliminary cumulative patch for the curious.

          kasha Karthik Kambatla added a comment -

          Working on a branch might help keep the reviews here simple. Creating a branch at YARN-4752. The branch follows RTC.

          asuresh Arun Suresh added a comment -

          Thanks for collating and aggregating all the outstanding issues, Karthik Kambatla.

          With respect to your doc, I was wondering if it is really necessary to have a PriorityQueue of apps based on how starved they are. Two apps A1 and A2 might be starved relative to their fair share by different amounts, but it is arguably not MORE 'fair' to pick the app with the larger deficit than to pick one at random. Ordering them on starvation time seems better (or maybe track the number of times each starved app is passed over in a preemption run and bubble up the apps that have been passed over most often).
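
To make the two orderings concrete, here is a minimal sketch. It is not taken from the patch: `StarvedApp`, its fields, and both comparators are hypothetical names used only for illustration. It shows that the same pair of starved apps is served in a different order depending on whether the PriorityQueue is keyed on deficit size or on how long each app has been starved.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; StarvedApp is an illustrative stand-in, not FSAppAttempt.
final class StarvedAppOrdering {
    static final class StarvedApp {
        final String name;
        final long deficitMb;       // how far the app is below its fair share
        final long starvedSinceMs;  // timestamp at which starvation began

        StarvedApp(String name, long deficitMb, long starvedSinceMs) {
            this.name = name;
            this.deficitMb = deficitMb;
            this.starvedSinceMs = starvedSinceMs;
        }
    }

    // Largest deficit first.
    static final Comparator<StarvedApp> BY_DEFICIT =
        Comparator.comparingLong((StarvedApp a) -> a.deficitMb).reversed();

    // Longest-starved first (earliest starvation timestamp).
    static final Comparator<StarvedApp> BY_STARVATION_TIME =
        Comparator.comparingLong((StarvedApp a) -> a.starvedSinceMs);

    // Returns the name of the app that would be served first, or null if there is none.
    static String nextApp(Comparator<StarvedApp> order, List<StarvedApp> apps) {
        PriorityQueue<StarvedApp> queue = new PriorityQueue<>(order);
        queue.addAll(apps);
        StarvedApp head = queue.poll();
        return head == null ? null : head.name;
    }

    public static void main(String[] args) {
        List<StarvedApp> apps = Arrays.asList(
            new StarvedApp("A1", 4096, 2000),   // larger deficit, starved more recently
            new StarvedApp("A2", 1024, 1000));  // smaller deficit, starved for longer
        System.out.println(nextApp(BY_DEFICIT, apps));          // A1
        System.out.println(nextApp(BY_STARVATION_TIME, apps));  // A2
    }
}
```

With a deficit-based key, A2 could be passed over repeatedly as long as apps with larger deficits keep arriving; a time-based key bounds how long any one app waits.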

          Also, I don't think the 'demand' is calculated accurately in the FSAppAttempt. It looks like the updateDemand() method in FSAppAttempt actually adds all RRs in a priority. This means that some requests will be counted 3 times (node + rack + ANY) and will show up as overly starved compared to requests for just ANY resource.
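
The double-counting concern can be illustrated with a small, self-contained sketch. It does not reproduce the actual `updateDemand()` code: `Request` is a simplified, hypothetical stand-in for a resource request, and both methods are assumptions written only to show the arithmetic. A single container asked for at node, rack and ANY level is counted three times if every entry is summed, and once if only the ANY entry is used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; Request is a simplified stand-in, not YARN's ResourceRequest.
final class DemandSketch {
    // Local constant for the "any location" resource name.
    static final String ANY = "*";

    static final class Request {
        final String resourceName;
        final long memoryMb;
        final int numContainers;

        Request(String resourceName, long memoryMb, int numContainers) {
            this.resourceName = resourceName;
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    // Sums every request at a priority, regardless of locality level.
    static long demandSummingAllLevels(List<Request> requests) {
        long total = 0;
        for (Request r : requests) {
            total += r.memoryMb * r.numContainers;
        }
        return total;
    }

    // Sums only the ANY-level entries, assuming they already cover the node and rack entries.
    static long demandFromAnyOnly(List<Request> requests) {
        long total = 0;
        for (Request r : requests) {
            if (ANY.equals(r.resourceName)) {
                total += r.memoryMb * r.numContainers;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One 1024 MB container, expressed at three locality levels.
        List<Request> oneContainer = Arrays.asList(
            new Request("node1", 1024, 1),
            new Request("/rack1", 1024, 1),
            new Request(ANY, 1024, 1));
        System.out.println(demandSummingAllLevels(oneContainer)); // 3072
        System.out.println(demandFromAnyOnly(oneContainer));      // 1024
    }
}
```

Under this assumption, an app with locality preferences would appear three times as starved as an app that asks for the same container with an ANY-only request.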


          kasha Karthik Kambatla added a comment -

          Attached is a summary of the current issues and an outline on one approach to solve the issues.

          Arun Suresh, Ashwin Shankar, Zhihai Xu - would like to hear your thoughts on the approach. Thanks.

          kasha Karthik Kambatla added a comment -

          Yes, my current intent is to split scheduling and preemption, as you can see in my initial patch posted on the github link above.

          xinxianyin Xianyin Xin added a comment -

          Hi Karthik Kambatla, can we consider decoupling scheduling and preemption, which currently use the same {{getResourceUsage()}}? This coupling blocks YARN-4120 and YARN-4090, which aim to improve scheduling throughput. It was discussed in YARN-4120; see, e.g., https://issues.apache.org/jira/browse/YARN-4120?focusedCommentId=14733235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14733235.

          Thanks.


          kasha Karthik Kambatla added a comment -

          Update: Started looking at each of the subtasks here to understand the problems. Working on putting together a doc to discuss the design. Towards that, have been toying with a prototype implementation - https://github.com/kambatla/hadoop/commits/fs-preemption

          People

            kasha Karthik Kambatla
            kasha Karthik Kambatla