Hadoop YARN / YARN-6194

Cluster capacity in SchedulingPolicy is updated only on allocation file reload

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: fairscheduler
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Some of the SchedulingPolicy methods need the cluster capacity, which is currently set through initialize(). However, initialize() is called only when the allocation file is reloaded. If nodes are added between reloads, the new cluster capacity is not taken into account until the next reload.
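The effect described above can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop classes: `Cluster`, `SnapshotPolicy`, and `LivePolicy` are hypothetical stand-ins, and capacity is reduced to a single memory figure. The sketch only assumes that one policy copies the capacity when it is initialized while the other asks for it on every call.

```java
import java.util.function.LongSupplier;

public class StaleCapacityDemo {
  /** Hypothetical stand-in for the scheduler's running total of node memory. */
  static final class Cluster {
    private long memoryMb;

    void addNode(long nodeMemoryMb) {
      memoryMb += nodeMemoryMb;
    }

    long getMemoryMb() {
      return memoryMb;
    }
  }

  /** Copies the capacity once in initialize(), as in the behavior reported here. */
  static final class SnapshotPolicy {
    private long clusterMemoryMb;

    void initialize(long clusterMemoryMb) {
      this.clusterMemoryMb = clusterMemoryMb;
    }

    double share(long usedMb) {
      return (double) usedMb / clusterMemoryMb;
    }
  }

  /** Asks for the capacity on every call, so added nodes are visible immediately. */
  static final class LivePolicy {
    private LongSupplier capacity;

    void initialize(LongSupplier capacity) {
      this.capacity = capacity;
    }

    double share(long usedMb) {
      return (double) usedMb / capacity.getAsLong();
    }
  }

  public static void main(String[] args) {
    Cluster cluster = new Cluster();
    cluster.addNode(8192);

    SnapshotPolicy snapshot = new SnapshotPolicy();
    snapshot.initialize(cluster.getMemoryMb()); // value frozen at this point
    LivePolicy live = new LivePolicy();
    live.initialize(cluster::getMemoryMb);      // value looked up on each call

    cluster.addNode(8192);                      // a node joins between reloads

    System.out.println(snapshot.share(4096));   // 0.5, still based on 8192 MB
    System.out.println(live.share(4096));       // 0.25, based on 16384 MB
  }
}
```

The snapshot variant keeps reporting a share against the old capacity until initialize() is called again, which is the gap this issue describes.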

          Activity

          githubbot ASF GitHub Bot added a comment -

          Github user flyrain closed the pull request at:

          https://github.com/apache/hadoop/pull/196

          githubbot ASF GitHub Bot added a comment -

          Github user flyrain commented on the issue:

          https://github.com/apache/hadoop/pull/196

          Committed

          githubbot ASF GitHub Bot added a comment -

          GitHub user flyrain reopened a pull request:

          https://github.com/apache/hadoop/pull/196

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload

          This patch passes the ClusterNodeTracker instead of ClusterResource into the DRF policy.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/flyrain/hadoop yarn-6194

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/hadoop/pull/196.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #196


          commit 78c6303b15d5531fc0bd22331a0720803ad5c416
          Author: Yufei Gu <yufei.gu@cloudera.com>
          Date: 2017-02-17T23:57:11Z

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload.

          commit 215d78f2d2ff997a5834fcc6beb903576bbc8d78
          Author: Yufei Gu <yufei.gu@cloudera.com>
          Date: 2017-02-22T19:40:57Z

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload.

          commit 96b3b42f70e0cc6e2fb55200eaa7d346191bc627
          Author: Yufei Gu <yufei.gu@cloudera.com>
          Date: 2017-02-22T20:40:08Z

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload.

          commit 339d76d8053b083672585f21e4f95b712df606ae
          Author: Yufei Gu <yufei.gu@cloudera.com>
          Date: 2017-02-22T21:24:19Z

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload.


          githubbot ASF GitHub Bot added a comment -

          Github user flyrain commented on the issue:

          https://github.com/apache/hadoop/pull/196

          Committed

          githubbot ASF GitHub Bot added a comment -

          Github user flyrain closed the pull request at:

          https://github.com/apache/hadoop/pull/196

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11291 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11291/)
          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on (kasha: rev b10e962224a8ae1c6031a05322b0cc5e564bd078)

          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/TestDominantResourceFairnessPolicy.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSContext.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java
          yufeigu Yufei Gu added a comment -

          Thanks Karthik Kambatla for the review and commit.

          kasha Karthik Kambatla added a comment -

          Just committed to trunk and branch-2. Thanks Yufei Gu for your contribution.

          kasha Karthik Kambatla added a comment -

          +1, checking this in.

          yufeigu Yufei Gu added a comment -

          The test failures are unrelated based on my local testing.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime   Comment
          0     reexec      0m 21s    Docker mode activated.
          +1    @author     0m 0s     The patch does not contain any @author tags.
          +1    test4tests  0m 0s     The patch appears to include 2 new or modified test files.
          +1    mvninstall  16m 34s   trunk passed
          +1    compile     0m 38s    trunk passed
          +1    checkstyle  0m 31s    trunk passed
          +1    mvnsite     0m 45s    trunk passed
          +1    mvneclipse  0m 17s    trunk passed
          +1    findbugs    1m 15s    trunk passed
          +1    javadoc     0m 29s    trunk passed
          +1    mvninstall  0m 39s    the patch passed
          +1    compile     0m 39s    the patch passed
          +1    javac       0m 39s    the patch passed
          +1    checkstyle  0m 27s    the patch passed
          +1    mvnsite     0m 37s    the patch passed
          +1    mvneclipse  0m 15s    the patch passed
          +1    whitespace  0m 0s     The patch has no whitespace issues.
          +1    findbugs    1m 42s    the patch passed
          +1    javadoc     0m 25s    the patch passed
          -1    unit        42m 8s    hadoop-yarn-server-resourcemanager in the patch failed.
          +1    asflicense  0m 18s    The patch does not generate ASF License warnings.
                            69m 36s



          Reason              Tests
          Failed junit tests  hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption
                              hadoop.yarn.server.resourcemanager.TestRMRestart



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue YARN-6194
          GITHUB PR https://github.com/apache/hadoop/pull/196
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 5f1457b0ec2c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 1a6ca75
          Default Java 1.8.0_121
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-YARN-Build/15049/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15049/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/15049/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          githubbot ASF GitHub Bot added a comment -

          Github user kambatla commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/196#discussion_r102573113

          --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java ---
          @@ -91,7 +91,23 @@ public static SchedulingPolicy parse(String policy)
               }
               return getInstance(clazz);
             }
          -
          +
          +  /**
          +   * Initialize the scheduling policy with cluster resources. Deprecated since
          --- End diff --

          Want to add a @deprecated tag in the javadoc and point to what to use?

          githubbot ASF GitHub Bot added a comment -

          Github user kambatla commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/196#discussion_r102573406

          --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java ---
          @@ -91,7 +91,23 @@ public static SchedulingPolicy parse(String policy)
               }
               return getInstance(clazz);
             }
          -
          +
          +  /**
          +   * Initialize the scheduling policy with cluster resources. Deprecated since
          +   * it doesn't track cluster resource changes.
          +   *
          +   * @param clusterCapacity cluster resources
          +   */
          +  @Deprecated
          +  public void initialize(Resource clusterCapacity) {}
          +
          +  /**
          +   * Initialize the scheduling policy with a {@link FSContext} object which can
          --- End diff --

          In the future, different policies could use different information from the FSContext. Maybe, instead of referring to it as the only thing, say something like "FSContext, which has a pointer to the cluster resources among other information."

          githubbot ASF GitHub Bot added a comment -

          Github user flyrain commented on the issue:

          https://github.com/apache/hadoop/pull/196

          Thanks Karthik for the review. Pushed a new commit addressing your comments.

          githubbot ASF GitHub Bot added a comment -

          Github user kambatla commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/196#discussion_r102559501

          --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java ---
          @@ -92,7 +92,7 @@ public static SchedulingPolicy parse(String policy)
               return getInstance(clazz);
             }

          -  public void initialize(Resource clusterCapacity) {}
          +  public void initialize(FSContext fsContext) {}
          --- End diff --

          Since this method is in an @Public class, let us add a new method and deprecate the old method.

          Let us also add javadoc for both methods.
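The pattern requested in this review comment (keep the old method, mark it deprecated, add a documented overload) might look roughly like the sketch below. It is not the committed code: `Policy` and `Context` are hypothetical stand-ins for SchedulingPolicy and FSContext, and capacity is reduced to a single memory value so that the example compiles on its own.

```java
public class PolicyApiDemo {
  /** Hypothetical stand-in for FSContext; exposes only what this sketch needs. */
  interface Context {
    long getClusterMemoryMb();
  }

  /** Hypothetical base class playing the role of SchedulingPolicy. */
  static class Policy {
    /**
     * Initialize the policy with a fixed cluster capacity.
     *
     * @param clusterMemoryMb cluster memory at the time of the call
     * @deprecated the value is never refreshed; use {@link #initialize(Context)}
     */
    @Deprecated
    public void initialize(long clusterMemoryMb) {}

    /**
     * Initialize the policy with a context that exposes the current cluster
     * capacity, among other information.
     *
     * @param context source of live scheduler state
     */
    public void initialize(Context context) {}
  }

  /** A subclass written against the old method keeps compiling unchanged. */
  static class LegacyPolicy extends Policy {
    long seenMb;

    @Override
    @SuppressWarnings("deprecation")
    public void initialize(long clusterMemoryMb) {
      seenMb = clusterMemoryMb;
    }
  }

  /** A subclass written against the new method reads the capacity on demand. */
  static class ContextPolicy extends Policy {
    private Context context;

    @Override
    public void initialize(Context context) {
      this.context = context;
    }

    long currentMb() {
      return context.getClusterMemoryMb();
    }
  }

  public static void main(String[] args) {
    long[] capacity = {8192L};
    Context context = () -> capacity[0];

    LegacyPolicy legacy = new LegacyPolicy();
    legacy.initialize(capacity[0]);
    ContextPolicy current = new ContextPolicy();
    current.initialize(context);

    capacity[0] = 16384L; // the cluster grows after initialization

    System.out.println(legacy.seenMb);       // 8192
    System.out.println(current.currentMb()); // 16384
  }
}
```

Keeping both overloads means existing subclasses of a public class are not broken at compile time, while the javadoc tells their authors which method to move to.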

          githubbot ASF GitHub Bot added a comment -

          Github user kambatla commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/196#discussion_r102555178

          --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/TestDominantResourceFairnessPolicy.java ---
          @@ -40,7 +43,10 @@
             private Comparator<Schedulable> createComparator(int clusterMem,
                 int clusterCpu) {
               DominantResourceFairnessPolicy policy = new DominantResourceFairnessPolicy();
          -    policy.initialize(BuilderUtils.newResource(clusterMem, clusterCpu));
          +    FSContext fsContext = mock(FSContext.class);
          +    when(fsContext.getClusterResource()).
          +        thenReturn(BuilderUtils.newResource(clusterMem, clusterCpu));
          --- End diff --

          Let us use Resources.create instead of BuilderUtils.newResource.

          githubbot ASF GitHub Bot added a comment -

          Github user kambatla commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/196#discussion_r102556457

          --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java ---
          @@ -3369,7 +3369,48 @@ public void testBasicDRFWithQueues() throws Exception {
               scheduler.handle(updateEvent);
               Assert.assertEquals(1, app2.getLiveContainers().size());
             }
          -
          +
          +  @Test
          +  public void testDRFWithClusterResourceChanges() throws Exception {
          --- End diff --

          Using a "real" scheduler with mock nodes seems excessive for this. Can we just mock the scheduler and context? Also, this might be a better fit for TestDRF than TestFairScheduler.
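A lightweight test along the lines suggested here could be sketched as follows. To stay self-contained it uses a hand-written stub rather than a mocking library, and it is a simplification rather than the real DRF comparator: `Res`, `StubContext`, `dominantShare`, and `comparator` are hypothetical names, and weights and minimum shares are ignored. The point is only that the ordering can change once the context reports a different cluster size.

```java
import java.util.Comparator;

public class DrfStubTestDemo {
  /** Hypothetical, simplified resource vector: memory in MB and virtual cores. */
  static final class Res {
    final long memMb;
    final int vcores;

    Res(long memMb, int vcores) {
      this.memMb = memMb;
      this.vcores = vcores;
    }
  }

  /** Hand-written stub standing in for a mocked context object. */
  static final class StubContext {
    private Res cluster;

    StubContext(Res cluster) {
      this.cluster = cluster;
    }

    Res getClusterResource() {
      return cluster;
    }

    void setClusterResource(Res cluster) {
      this.cluster = cluster;
    }
  }

  /** Dominant share: the larger of the memory and CPU fractions of the cluster. */
  static double dominantShare(Res usage, Res cluster) {
    return Math.max((double) usage.memMb / cluster.memMb,
        (double) usage.vcores / cluster.vcores);
  }

  /** Orders usages by dominant share, reading the capacity on every compare. */
  static Comparator<Res> comparator(StubContext context) {
    return Comparator.comparingDouble(
        (Res usage) -> dominantShare(usage, context.getClusterResource()));
  }

  public static void main(String[] args) {
    StubContext context = new StubContext(new Res(8192, 4));
    Comparator<Res> cmp = comparator(context);
    Res a = new Res(4096, 1); // memory-heavy usage
    Res b = new Res(1024, 3); // CPU-heavy usage

    // Shares are 0.5 for a and 0.75 for b, so a sorts first.
    System.out.println(cmp.compare(a, b) < 0); // true

    // After the cluster grows, shares are 0.4 for a and 0.1875 for b.
    context.setClusterResource(new Res(10240, 16));
    System.out.println(cmp.compare(a, b) > 0); // true
  }
}
```

No scheduler or node objects are needed: the stub is the only collaborator, which keeps the test focused on how the policy reacts to a capacity change.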

          githubbot ASF GitHub Bot added a comment -

          Github user kambatla commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/196#discussion_r102554581

          --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSContext.java ---
          @@ -27,28 +29,37 @@
             private boolean preemptionEnabled = false;
             private float preemptionUtilizationThreshold;
             private FSStarvedApps starvedApps;
          +  private FairScheduler scheduler;
          +
          +  FSContext(FairScheduler scheduler) {
          +    this.scheduler = scheduler;
          +  }

          -  public boolean isPreemptionEnabled() {
          +  boolean isPreemptionEnabled() {
               return preemptionEnabled;
             }

          -  public void setPreemptionEnabled() {
          +  void setPreemptionEnabled() {
               this.preemptionEnabled = true;
               if (starvedApps == null) {
                 starvedApps = new FSStarvedApps();
               }
             }

          -  public FSStarvedApps getStarvedApps() {
          +  FSStarvedApps getStarvedApps() {
               return starvedApps;
             }

          -  public float getPreemptionUtilizationThreshold() {
          +  float getPreemptionUtilizationThreshold() {
               return preemptionUtilizationThreshold;
             }

          -  public void setPreemptionUtilizationThreshold(
          +  void setPreemptionUtilizationThreshold(
                 float preemptionUtilizationThreshold) {
               this.preemptionUtilizationThreshold = preemptionUtilizationThreshold;
             }
          +
          +  public Resource getClusterResource() {
          +    return scheduler.getClusterResource();
          --- End diff --

          This looks okay for now, but this allows a scheduling policy to modify the overall cluster's resources. Making a copy could be expensive, as this is called on every compare call.
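One possible way to address this concern without copying on every call is to hand out the live object through a read-only interface. The sketch below is an illustration only, not something the patch does: `ResourceView`, `MutableResource`, and `Context` are hypothetical types, not Hadoop classes.

```java
public class ReadOnlyViewDemo {
  /** Hypothetical read-only view of a resource vector. */
  interface ResourceView {
    long getMemoryMb();

    int getVcores();
  }

  /** Hypothetical mutable resource owned by the scheduler. */
  static final class MutableResource implements ResourceView {
    private long memoryMb;
    private int vcores;

    MutableResource(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }

    @Override
    public long getMemoryMb() {
      return memoryMb;
    }

    @Override
    public int getVcores() {
      return vcores;
    }

    void add(long memoryMb, int vcores) {
      this.memoryMb += memoryMb;
      this.vcores += vcores;
    }
  }

  /** Hypothetical context that exposes the cluster total to policies. */
  static final class Context {
    private final MutableResource cluster;

    Context(MutableResource cluster) {
      this.cluster = cluster;
    }

    /** Returns the live object typed as a view: no copy is made. */
    ResourceView getClusterResource() {
      return cluster;
    }
  }

  public static void main(String[] args) {
    MutableResource cluster = new MutableResource(8192, 4);
    Context context = new Context(cluster);
    ResourceView view = context.getClusterResource();

    cluster.add(8192, 4); // the scheduler registers another node

    // The view reflects the update because it is the same object.
    System.out.println(view.getMemoryMb() + " MB, " + view.getVcores() + " vcores");
    // view.add(1, 1) would not compile: ResourceView declares no mutators.
  }
}
```

This avoids the per-compare allocation mentioned above, but it is not airtight: a caller that knows the concrete type can still downcast and mutate it.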

          yufeigu Yufei Gu added a comment -

          Updated the PR. Using FSContext instead of ClusterNodeTracker.

          yufeigu Yufei Gu added a comment -

          Created the PR. There is no TODO item in TestFairSchedulerPreemption.

          githubbot ASF GitHub Bot added a comment -

          GitHub user flyrain opened a pull request:

          https://github.com/apache/hadoop/pull/196

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload

          This patch passes the ClusterNodeTracker instead of ClusterResource into the DRF policy.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/flyrain/hadoop yarn-6194

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/hadoop/pull/196.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #196


          commit 78c6303b15d5531fc0bd22331a0720803ad5c416
          Author: Yufei Gu <yufei.gu@cloudera.com>
          Date: 2017-02-17T23:57:11Z

          YARN-6194. Cluster capacity in SchedulingPolicy is updated only on allocation file reload.


          kasha Karthik Kambatla added a comment -

          When this is fixed, we should remove the TODO in TestFairSchedulerPreemption.


            People

            • Assignee: yufeigu Yufei Gu
            • Reporter: kasha Karthik Kambatla
            • Votes: 0
            • Watchers: 4
