Hadoop YARN / YARN-474

CapacityScheduler does not activate applications when maximum-am-resource-percent configuration is refreshed

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha, 0.23.6
    • Fix Version/s: 2.1.0-beta
    • Component/s: capacityscheduler
    • Labels:
      None

      Description

      Submit 3 applications to a cluster where the capacity scheduler limits allow only 1 running application. Then modify the capacity scheduler configuration to increase the value of yarn.scheduler.capacity.maximum-am-resource-percent and invoke refresh queues.

      The 2 applications that are not yet running do not get launched, even though the limits have been increased.
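
To make the scenario concrete, here is a minimal Java sketch of how a maximum-am-resource-percent value could translate into a cap on concurrently running applications. The class `AmLimitSketch`, the method `maxActiveApplications`, and the cluster sizes are hypothetical, and the formula is a simplified assumption (the real scheduler also takes queue capacities into account). It only illustrates why raising the percentage should let the waiting applications start.

```java
// Hypothetical sketch, not the actual CapacityScheduler code.
final class AmLimitSketch {

    /**
     * Simplified assumption: the number of application masters that fit into
     * the share of the cluster reserved for them, with a floor of one.
     */
    static int maxActiveApplications(int clusterMemoryMb,
                                     int minAllocationMb,
                                     double maxAmResourcePercent) {
        double amSlots =
            ((double) clusterMemoryMb / minAllocationMb) * maxAmResourcePercent;
        return Math.max((int) Math.ceil(amSlots), 1);
    }

    public static void main(String[] args) {
        // Illustrative numbers: 8 GB cluster, 1 GB minimum allocation.
        int before = maxActiveApplications(8192, 1024, 0.1);
        int after = maxActiveApplications(8192, 1024, 0.5);
        System.out.println("limit before refresh: " + before);
        System.out.println("limit after refresh:  " + after);
    }
}
```

With these illustrative numbers the cap rises from 1 to 4, so all 3 submitted applications would be expected to run after the refresh.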

      1. YARN-474.4.patch
        6 kB
        Zhijie Shen
      2. YARN-474.3.patch
        6 kB
        Zhijie Shen
      3. YARN-474.2.patch
        6 kB
        Zhijie Shen
      4. YARN-474.1.patch
        8 kB
        Zhijie Shen

          Activity

          Vinod Kumar Vavilapalli added a comment -

          Build passed, closing this. Thanks for the help, Konstantin Boudnik and Thomas Graves!

          Vinod Kumar Vavilapalli added a comment -

          Merged HADOOP-8415 and this one (again) into branch-2. Triggering a branch-2 build.

          Vinod Kumar Vavilapalli added a comment -

          I looked at HADOOP-8415, which added support for doubles in the configuration. It's useful and very low risk, so I'm going to merge it into branch-2 and then merge this patch.

          Thomas Graves added a comment -

          I reverted the branch-2 change for now.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1385 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1385/)
          YARN-474. Fix CapacityScheduler to trigger application-activation when am-resource-percent configuration is refreshed. Contributed by Zhijie Shen. (Revision 1461402)

          Result = SUCCESS
          vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1461402
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1357 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1357/)
          YARN-474. Fix CapacityScheduler to trigger application-activation when am-resource-percent configuration is refreshed. Contributed by Zhijie Shen. (Revision 1461402)

          Result = FAILURE
          vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1461402
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #168 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/168/)
          YARN-474. Fix CapacityScheduler to trigger application-activation when am-resource-percent configuration is refreshed. Contributed by Zhijie Shen. (Revision 1461402)

          Result = SUCCESS
          vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1461402
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12575682/YARN-474.4.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/616//console

          This message is automatically generated.

          Zhijie Shen added a comment -

          Branch-2 failed because its Configuration class unexpectedly doesn't have setDouble. To be compatible with branch-2, we simply need to change setDouble to setFloat in TestLeafQueue.

          Zhijie Shen added a comment -

          The patch broke the branch-2 build.

          Konstantin Boudnik added a comment -

          Here's the error message:

          [ERROR] /home/jenkins/jenkins-slave/workspace/Hadoop-branch2/branch-2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java:[1610,10] cannot find symbol
          [ERROR] symbol  : method setDouble(java.lang.String,float)
          [ERROR] location: class org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration
          
          Konstantin Boudnik added a comment -

          It seems that this commit has broken the branch-2 build.

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3532 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3532/)
          YARN-474. Fix CapacityScheduler to trigger application-activation when am-resource-percent configuration is refreshed. Contributed by Zhijie Shen. (Revision 1461402)

          Result = SUCCESS
          vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1461402
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
          Vinod Kumar Vavilapalli added a comment -

          I just committed this to trunk and branch-2. Thanks Zhijie!

          Vinod Kumar Vavilapalli added a comment -

          The latest patch looks good, I am checking it in.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12575633/YARN-474.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/613//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/613//console

          This message is automatically generated.

          Zhijie Shen added a comment -

          @Vinod's comments are addressed in the newest patch. In addition, I've tested the patch on a one-node cluster and seen it work.

          Vinod Kumar Vavilapalli added a comment -

          I looked at the patch and it looks good overall. A few comments:

          • LeafQueue: The log statement isn't useful in general beyond debugging this issue, so we can remove it.
          • testActivateApplicationAfterReinitialization can be renamed to testActivateApplicationAfterQueueRefresh.
          • testActivateApplicationAfterReinitialization should be modified to explicitly verify the change to maximum-am-resource-percent.
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12575555/YARN-474.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/604//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/604//console

          This message is automatically generated.

          Zhijie Shen added a comment -

          Separated the fix for the specific problem of YARN-474 from that of YARN-209 to make the other issue independent, though the two issues share the same root cause.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12574695/YARN-474.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 tests included appear to have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/556//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/556//console

          This message is automatically generated.

          Zhijie Shen added a comment -

          According to my investigation, the problem is that the applications in the pending list are only activated when an application is added or removed.

          In contrast, when the metrics of a queue are updated (triggered by either refreshing the configuration or updating the cluster resource), the application activation function is not called.

          So the solution is that when the queue metrics are updated, we need to activate the pending applications if possible.

          Please have a look at the patch for the details. It also includes the corresponding tests.
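
The root cause described above can be sketched with a small, self-contained Java model. This is not the code from the patch: the class `LeafQueueSketch`, its methods, and the `activateOnRefresh` flag are hypothetical names used only to contrast the behavior before and after the fix, under the assumption that activation is governed by a single limit on active applications.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a leaf queue; not the real LeafQueue class.
final class LeafQueueSketch {
    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> active = new ArrayList<>();
    private int maxActiveApplications;

    LeafQueueSketch(int maxActiveApplications) {
        this.maxActiveApplications = maxActiveApplications;
    }

    // Adding an application triggers activation.
    void submit(String appId) {
        pending.addLast(appId);
        activateApplications();
    }

    // Removing an application triggers activation as well.
    void finish(String appId) {
        active.remove(appId);
        activateApplications();
    }

    // Models a queue refresh. Passing false mimics the reported bug
    // (limit updated, pending list not re-examined); true mimics the fix.
    void refreshLimit(int newLimit, boolean activateOnRefresh) {
        maxActiveApplications = newLimit;
        if (activateOnRefresh) {
            activateApplications();
        }
    }

    // Move pending applications to active while the limit allows it.
    private void activateApplications() {
        while (!pending.isEmpty() && active.size() < maxActiveApplications) {
            active.add(pending.removeFirst());
        }
    }

    int activeCount() { return active.size(); }

    int pendingCount() { return pending.size(); }

    public static void main(String[] args) {
        LeafQueueSketch queue = new LeafQueueSketch(1);
        queue.submit("app-1");
        queue.submit("app-2");
        queue.submit("app-3");
        queue.refreshLimit(3, false);
        System.out.println("active without activation on refresh: " + queue.activeCount());
        queue.refreshLimit(3, true);
        System.out.println("active with activation on refresh: " + queue.activeCount());
    }
}
```

In this model, the two waiting applications stay pending after the first refresh and only start once the refresh path also runs the activation step, which mirrors the behavior reported in this issue.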


            People

            • Assignee:
              Zhijie Shen
              Reporter:
              Hitesh Shah
            • Votes:
              0
              Watchers:
              8
