Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Currently, the same ordering policy is used for both pending applications and active applications. When priority is configured for applications, high-priority applications get activated first during recovery, even though a low-priority job may already have been submitted and be in a running state.
      This leaves the low-priority job starving after recovery.
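
      A minimal, self-contained sketch of the problem (the App class, its fields, and the comparator are illustrative only, not the actual YARN classes): when pending applications are ordered purely by priority, an application that was already running before the restart but has a lower priority sorts behind newly recovered higher-priority applications and may never be re-activated.

          import java.util.ArrayList;
          import java.util.Arrays;
          import java.util.Comparator;
          import java.util.List;

          public class RecoveryStarvationSketch {
            static class App {
              final String id;
              final int priority;        // higher value = higher priority
              final boolean wasRunning;  // was running before the RM restart
              App(String id, int priority, boolean wasRunning) {
                this.id = id; this.priority = priority; this.wasRunning = wasRunning;
              }
              @Override public String toString() { return id; }
            }

            public static void main(String[] args) {
              List<App> pending = new ArrayList<>(Arrays.asList(
                  new App("low-prio-running-before-restart", 1, true),
                  new App("high-prio-1", 5, false),
                  new App("high-prio-2", 5, false)));

              // Shared priority-based ordering: the previously running low-priority
              // app ends up last, so once activation slots run out it starves.
              pending.sort(Comparator.comparingInt((App a) -> a.priority).reversed());
              System.out.println("Priority-only activation order: " + pending);
            }
          }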

      1. 0006-YARN-4479.patch
        21 kB
        Rohith Sharma K S
      2. 0005-YARN-4479.patch
        21 kB
        Rohith Sharma K S
      3. 0004-YARN-4479.patch
        22 kB
        Rohith Sharma K S
      4. 0004-YARN-4479.patch
        21 kB
        Rohith Sharma K S
      5. 0003-YARN-4479.patch
        25 kB
        Rohith Sharma K S
      6. 0002-YARN-4479.patch
        23 kB
        Rohith Sharma K S
      7. 0001-YARN-4479.patch
        28 kB
        Rohith Sharma K S

        Activity

        rohithsharma Rohith Sharma K S added a comment -

        Filed a ticket, YARN-4617, for handling the above-mentioned points.

        Naganarasimha Naganarasimha G R added a comment -

        Thanks for pointing that out, Tan, Wangda.

        Instead of using queue's configured ordering-policy for pending apps, it should use FifoOrderingPolicyForPendingApps.

        Yes, and having FairOrderingPolicy for the pending apps does not make sense, as getCachedUsed will always be zero. And as you said, we can keep it as a fixed policy for now since there is no real-world need to make the pending-apps ordering configurable. So I am OK with this approach.

        leftnoteasy Wangda Tan added a comment -

        Wangda Tan Would you mind if I take over new JIRA fixing all these issues?

        +1 to fix this in a separate JIRA.

        rohithsharma Rohith Sharma K S added a comment -

        But it was then crossed off. What was the reason ?

        Yes, we discussed it offline and crossed it off. The reasons are:

        1. After YARN-3873, it was assumed that active and pending apps should always share the same policy. But in an earlier comment in this JIRA, Wangda Tan pointed out that they should not be the same.
        2. Another reason was that we were thinking of introducing a new configuration for the pending ordering policy. Wangda suggests that we can instead have a fixed ordering policy regardless of the active ordering policy.
        jianhe Jian He added a comment -

        Rohith Sharma K S, I think we had the same discussion offline about whether it's worth having a separate ordering policy for pendingApps. But it was then crossed off. What was the reason? Is it because we thought pending apps and active apps should share the same policy? Now I agree this approach is better, if we agree that pending apps can just be treated separately.

        rohithsharma Rohith Sharma K S added a comment -

        Right, I think we can spin it off as an improvement since a few JIRAs went in on top of this.
        Wangda Tan, would you mind if I take over a new JIRA fixing all these issues?

        sunilg Sunil G added a comment -

        Yes, makes sense to me. +1 for the approach.
        I think we can spin this off into a new ticket, because a few patches went in on top of this change, so reverting may be complex.

        Thoughts?

        rohithsharma Rohith Sharma K S added a comment -

        I think we still need to treat the ordering policy of pending apps specially, and with the newly added FifoOrderingPolicyForPendingApps we don't need to reset isRecoverying; other ordering policies don't need the isRecoverying field, so it won't affect performance.

        +1 for the approach, makes sense to me.

        leftnoteasy Wangda Tan added a comment -

        Hi Sunil G/Rohith Sharma K S,

        Please note one of my comments above:

        Instead of using queue's configured ordering-policy for pending apps, it should use FifoOrderingPolicyForPendingApps.
        with the change, user cannot configure ordering-policy for pending-apps, I didn't see a strong real life requirement for that now.
        

        I think we still need to treat the ordering policy of pending apps specially, and with the newly added FifoOrderingPolicyForPendingApps we don't need to reset isRecoverying; other ordering policies don't need the isRecoverying field, so it won't affect performance.

        2. With new FifoOrderingPolicyForPendingApps, we are agreeing that Priority and Submission Time will be factor to decide pending apps ordering, correct?

        Yes, the only difference between FifoOrderingPolicyForPendingApps and FifoOrderingPolicy is that FifoOrderingPolicyForPendingApps lets recovering apps go first.
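
        A hedged sketch of the comparator composition described here (the App class and field names are illustrative, not the actual SchedulableEntity API): FifoOrderingPolicyForPendingApps is essentially the usual priority-plus-submission-time FIFO ordering with an extra "recovering attempts first" key in front.

            import java.util.Comparator;

            public class PendingAppsOrderingSketch {
              static class App {
                final int priority;        // higher value = higher priority
                final long submitTime;     // earlier submission sorts first
                final boolean recovering;  // attempt recovered from the previous RM run
                App(int priority, long submitTime, boolean recovering) {
                  this.priority = priority;
                  this.submitTime = submitTime;
                  this.recovering = recovering;
                }
              }

              // Plain FIFO ordering: priority first, then submission time.
              static final Comparator<App> FIFO =
                  Comparator.comparingInt((App a) -> a.priority).reversed()
                      .thenComparingLong(a -> a.submitTime);

              // Ordering for pending apps: recovering attempts go first, then FIFO.
              static final Comparator<App> FIFO_FOR_PENDING_APPS =
                  Comparator.comparing((App a) -> !a.recovering).thenComparing(FIFO);
            }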

        rohithsharma Rohith Sharma K S added a comment -

        Hi Wangda Tan
        There were two approaches to solve this:

        1. Handle this problem in the upper layer, i.e. the LeafQueue layer, regardless of the ordering policy: 0006-YARN-4479.patch
        2. Handle it in the very bottom layer, i.e. specific to the ordering policy. Any newly added ordering policy then has to take care of this JIRA's problem: 0003-YARN-4479.patch
          Issues with approach 2 are:
          1. Performance: it does a comparison on wasAttemptRecovering on every addition or removal. When there is a huge number of applications, the impact is significant.
          2. The flag needs to be reset while adding to the activeApplications list.
        sunilg Sunil G added a comment -

        Thanks Wangda Tan for the detailed comment.
        Yes, I understood the reasoning behind it when FairOrderingPolicy is used. Generally fine with the suggested approach. However, a few minor points:

        1.

        Keep changes for SchedulerApplicationAttempt/CapacityScheduler in the patch: 0006-YARN-4479.patch to set "isRecoverying" field

        We already have a recovering field in SchedulerApplicationAttempt. But if we make use of it, we need to reset the flag once the application is moved to the activeApplications list. If we reset this flag, we lose the information that this was a recovered app. As of now I don't see any use for it, but it may confuse. So could we introduce a new flag/field for this to avoid confusion?

        2. With new FifoOrderingPolicyForPendingApps, we are agreeing that Priority and Submission Time will be factor to decide pending apps ordering, correct?

        leftnoteasy Wangda Tan added a comment -

        Thanks Sunil G/Naganarasimha G R,

        I looked at the existing code again, and I still think such logic should exist in the orderingPolicy only.

        Such logic increases the complexity of LeafQueue, which has to handle the newly added pendingOPForRecoveredApps in many places, such as:

        765	      if (application.isAttemptRecovering()) {
        766	        pendingOPForRecoveredApps.removeSchedulableEntity(application);
        767	      } else {
        768	        pendingOrderingPolicy.removeSchedulableEntity(application);
        769	      }
        

        And

        1566	    for (FiCaSchedulerApp pendingApp : pendingOPForRecoveredApps
        1567	        .getSchedulableEntities()) {
        1568	      apps.add(pendingApp.getApplicationAttemptId());
        1569	    }
        1570	    for (FiCaSchedulerApp pendingApp : pendingOrderingPolicy
        

        In addition to this problem, I just noticed another issue with the pending-ordering-policy introduced by YARN-3873:
        it assumes the queue's ordering-policy for pending apps should be the same as the ordering-policy for active apps. But actually it shouldn't be; for example, the pending-ordering-policy for fair-ordering-policy should be FIFO instead of FAIR.

        Some ideas off the top of my mind to fix the above issue and clean up the code:

        • Keep changes for SchedulerApplicationAttempt/CapacityScheduler in the patch: 0006-YARN-4479.patch to set "isRecoverying" field
        • Add a RecoveryComparator, and add a new FifoOrderingPolicyForPendingApps which extends FifoOrderingPolicy but uses RecoveryComparator
        • Instead of using queue's configured ordering-policy for pending apps, it should use FifoOrderingPolicyForPendingApps.
          with the change, user cannot configure ordering-policy for pending-apps, I didn't see a strong real life requirement for that now.

        Thoughts?

        sunilg Sunil G added a comment -

        Yes Wangda Tan, as mentioned by NGarla_Unused, this option came up as a possible solution. However, there were a few complexities:

        For this approach, we needed a new RecoveryComparator, which would also have to be added to FifoOrderingPolicy. RecoveryComparator was supposed to run with the information of whether the app was running prior to recovery, so a flag had to be added to FiCaSchedulerApp and then reset after the first round of activation. Hence this approach needed more complexity in various parts of the scheduler, so a simpler approach was taken in LeafQueue. Please share your thoughts if we missed anything in this approach.
        Rohith Sharma K S, could you please add anything I missed about this approach?

        Naganarasimha Naganarasimha G R added a comment -

        I think you are referring to an approach similar to the one done in 0002-YARN-4479.patch: having additional logic in the comparator which checks whether the attempt wasAttemptRunningEarlier. After discussion we tried to avoid it, as unnecessary comparisons would happen even after recovery when comparing each app. If you have any other approach, maybe we can discuss it further.

        leftnoteasy Wangda Tan added a comment -

        Hi Rohith Sharma K S,

        Apologies for my very late feedback. Instead of adding a new list of recovery-and-pending apps, could we add this behavior (earlier-submitted & running apps go first) to our existing policy? Maintaining only one ordering policy in LeafQueue is easier.

        Thoughts? Jian He/Naganarasimha G R/Sunil G

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #9075 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9075/)
        YARN-4479. Change CS LeafQueue pendingOrderingPolicy to hornor recovered (jianhe: rev 109e528ef5d8df07443373751266b4417acc981a)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationPriority.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
        jianhe Jian He added a comment -

        Committed to trunk, branch-2, branch-2.8. Thanks Rohith!

        Thanks Sunil G and Naganarasimha G R for reviewing the patch!

        rohithsharma Rohith Sharma K S added a comment -

        The failed test cases are unrelated to this patch. These test failures will be handled in YARN-4478.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        +1 mvninstall 7m 51s trunk passed
        +1 compile 1m 58s trunk passed with JDK v1.8.0_66
        +1 compile 2m 16s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 30s trunk passed
        +1 mvnsite 2m 43s trunk passed
        +1 mvneclipse 0m 21s trunk passed
        -1 findbugs 6m 39s branch/hadoop-yarn-project/hadoop-yarn no findbugs output file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml)
        +1 javadoc 1m 57s trunk passed with JDK v1.8.0_66
        +1 javadoc 4m 32s trunk passed with JDK v1.7.0_91
        +1 mvninstall 2m 11s the patch passed
        +1 compile 1m 56s the patch passed with JDK v1.8.0_66
        +1 javac 1m 56s the patch passed
        +1 compile 2m 14s the patch passed with JDK v1.7.0_91
        +1 javac 2m 14s the patch passed
        +1 checkstyle 0m 32s the patch passed
        +1 mvnsite 2m 40s the patch passed
        +1 mvneclipse 0m 19s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 xml 0m 0s The patch has no ill-formed XML file.
        -1 findbugs 6m 40s patch/hadoop-yarn-project/hadoop-yarn no findbugs output file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml)
        +1 javadoc 1m 56s the patch passed with JDK v1.8.0_66
        +1 javadoc 4m 31s the patch passed with JDK v1.7.0_91
        -1 unit 75m 29s hadoop-yarn in the patch failed with JDK v1.8.0_66.
        -1 unit 80m 25s hadoop-yarn in the patch failed with JDK v1.7.0_91.
        +1 asflicense 0m 22s Patch does not generate ASF License warnings.
        209m 2s



        Reason Tests
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.TestAMAuthorization
          hadoop.yarn.server.resourcemanager.TestClientRMTokens
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.TestAMAuthorization
          hadoop.yarn.server.resourcemanager.TestClientRMTokens



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12780671/0006-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense findbugs xml compile javac javadoc mvninstall mvnsite unit checkstyle
        uname Linux b4ddfef5977c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / c52b407
        Default Java 1.7.0_91
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10167/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10167/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10167/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10167/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10167/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Max memory used 76MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10167/console

        This message was automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Updating the patch, fixing the test failure in TestApplicationLimits.
        The other test failures are unrelated to the patch.
        Findbugs warning: the -1 is due to the output file not being found. I think it is unrelated to the patch.

        jianhe Jian He added a comment -

        +1. Rohith Sharma K S, could you see if the warnings are related?

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        +1 mvninstall 7m 49s trunk passed
        +1 compile 2m 20s trunk passed with JDK v1.8.0_66
        +1 compile 2m 6s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 30s trunk passed
        +1 mvnsite 2m 37s trunk passed
        +1 mvneclipse 0m 18s trunk passed
        -1 findbugs 6m 18s branch/hadoop-yarn-project/hadoop-yarn no findbugs output file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml)
        +1 javadoc 1m 53s trunk passed with JDK v1.8.0_66
        +1 javadoc 4m 31s trunk passed with JDK v1.7.0_91
        +1 mvninstall 2m 6s the patch passed
        +1 compile 1m 49s the patch passed with JDK v1.8.0_66
        +1 javac 1m 49s the patch passed
        +1 compile 2m 5s the patch passed with JDK v1.7.0_91
        +1 javac 2m 5s the patch passed
        +1 checkstyle 0m 29s the patch passed
        +1 mvnsite 2m 34s the patch passed
        +1 mvneclipse 0m 18s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 xml 0m 0s The patch has no ill-formed XML file.
        -1 findbugs 6m 21s patch/hadoop-yarn-project/hadoop-yarn no findbugs output file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml)
        +1 javadoc 1m 50s the patch passed with JDK v1.8.0_66
        +1 javadoc 4m 14s the patch passed with JDK v1.7.0_91
        -1 unit 83m 27s hadoop-yarn in the patch failed with JDK v1.8.0_66.
        -1 unit 84m 51s hadoop-yarn in the patch failed with JDK v1.7.0_91.
        +1 asflicense 0m 24s Patch does not generate ASF License warnings.
        219m 58s



        Reason Tests
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
          hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
          hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
          hadoop.yarn.server.nodemanager.containermanager.TestContainerManager



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12780566/0005-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense findbugs xml compile javac javadoc mvninstall mvnsite unit checkstyle
        uname Linux 60a88099b65c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 96d8f1d
        Default Java 1.7.0_91
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10154/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10154/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10154/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10154/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10154/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Max memory used 76MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10154/console

        This message was automatically generated.

        sunilg Sunil G added a comment -

        Patch looks good Rohith Sharma K S.

        rohithsharma Rohith Sharma K S added a comment -

        Updated the 0005-YARN-4479 patch, kindly review it.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        +1 mvninstall 7m 33s trunk passed
        +1 compile 1m 54s trunk passed with JDK v1.8.0_66
        +1 compile 2m 16s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 32s trunk passed
        +1 mvnsite 2m 36s trunk passed
        +1 mvneclipse 0m 20s trunk passed
        -1 findbugs 6m 32s branch/hadoop-yarn-project/hadoop-yarn no findbugs output file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml)
        +1 javadoc 1m 55s trunk passed with JDK v1.8.0_66
        +1 javadoc 4m 22s trunk passed with JDK v1.7.0_91
        +1 mvninstall 2m 10s the patch passed
        +1 compile 2m 1s the patch passed with JDK v1.8.0_66
        +1 javac 2m 1s the patch passed
        +1 compile 2m 6s the patch passed with JDK v1.7.0_91
        +1 javac 2m 6s the patch passed
        -1 checkstyle 0m 31s Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 357, now 353).
        +1 mvnsite 2m 34s the patch passed
        +1 mvneclipse 0m 18s the patch passed
        -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
        +1 xml 0m 1s The patch has no ill-formed XML file.
        -1 findbugs 6m 19s patch/hadoop-yarn-project/hadoop-yarn no findbugs output file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml)
        +1 javadoc 1m 48s the patch passed with JDK v1.8.0_66
        +1 javadoc 4m 18s the patch passed with JDK v1.7.0_91
        -1 unit 80m 54s hadoop-yarn in the patch failed with JDK v1.8.0_66.
        -1 unit 82m 56s hadoop-yarn in the patch failed with JDK v1.7.0_91.
        +1 asflicense 0m 19s Patch does not generate ASF License warnings.
        215m 20s



        Reason Tests
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
          hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
          hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
          hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12780288/0004-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense findbugs xml compile javac javadoc mvninstall mvnsite unit checkstyle
        uname Linux 5ed19af0089b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 7dafee1
        Default Java 1.7.0_91
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/10142/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10142/artifact/patchprocess/whitespace-eol.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10142/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10142/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10142/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10142/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10142/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Max memory used 76MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10142/console

        This message was automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Updated the patch, fixing some of the checkstyle and findbugs warnings.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        +1 mvninstall 7m 27s trunk passed
        +1 compile 0m 29s trunk passed with JDK v1.8.0_66
        +1 compile 0m 31s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 17s trunk passed
        +1 mvnsite 0m 36s trunk passed
        +1 mvneclipse 0m 15s trunk passed
        +1 findbugs 1m 12s trunk passed
        +1 javadoc 0m 23s trunk passed with JDK v1.8.0_66
        +1 javadoc 0m 28s trunk passed with JDK v1.7.0_91
        +1 mvninstall 0m 31s the patch passed
        +1 compile 0m 25s the patch passed with JDK v1.8.0_66
        +1 javac 0m 25s the patch passed
        +1 compile 0m 30s the patch passed with JDK v1.7.0_91
        +1 javac 0m 30s the patch passed
        -1 checkstyle 0m 18s Patch generated 5 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 357, now 356).
        +1 mvnsite 0m 34s the patch passed
        +1 mvneclipse 0m 13s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        -1 findbugs 1m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager introduced 2 new FindBugs issues.
        +1 javadoc 0m 20s the patch passed with JDK v1.8.0_66
        +1 javadoc 0m 25s the patch passed with JDK v1.7.0_91
        -1 unit 63m 27s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66.
        -1 unit 64m 43s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91.
        +1 asflicense 0m 19s Patch does not generate ASF License warnings.
        145m 49s



        Reason Tests
        FindBugs module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.pendingOrderingPolicy; locked 87% of time Unsynchronized access at LeafQueue.java:87% of time Unsynchronized access at LeafQueue.java:[line 1517]
          Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.pendingOrderingPolicyRecovery; locked 87% of time Unsynchronized access at LeafQueue.java:87% of time Unsynchronized access at LeafQueue.java:[line 1519]
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12780101/0004-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux aa9ba4718f2a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 4e4b3a8
        Default Java 1.7.0_91
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/10134/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        findbugs https://builds.apache.org/job/PreCommit-YARN-Build/10134/artifact/patchprocess/new-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.html
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10134/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10134/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10134/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10134/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10134/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Max memory used 76MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10134/console

        This message was automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Kindly review the updated patch.

        rohithsharma Rohith Sharma K S added a comment -

        Discussed offline with Jian He to sync up on the solution. The summary is as follows, and the patch has been updated accordingly.

        1. For a failed attempt, there is no need to add it to the scheduler during recovery. The necessary changes are done in RMAppAttemptImpl.
        2. If an attempt is added to the scheduler, it means the attempt was running before the RM restart.
        3. Any recovering attempts are added to a new ordering policy, pendingOrderingPolicyRecovery, which is given higher preference than pendingOrderingPolicy while activating the applications (see the sketch below).
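        For illustration only, a minimal sketch of how that routing could look when an attempt is added to the LeafQueue during recovery. The two policy fields are the ones named in this discussion; the method shape and the wasAttemptRunning flag are assumptions, not the actual patch:

            // Sketch, not the actual patch: attempts that were running before the
            // RM restart go to the recovery pending policy so they get activated
            // ahead of the normally pending applications.
            private synchronized void addApplicationAttempt(
                FiCaSchedulerApp application, boolean isAttemptRecovering,
                boolean wasAttemptRunning) {
              if (isAttemptRecovering && wasAttemptRunning) {
                pendingOrderingPolicyRecovery.addSchedulableEntity(application);
              } else {
                pendingOrderingPolicy.addSchedulableEntity(application);
              }
            }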
        jianhe Jian He added a comment -

        Sorry, I missed this part. The recovered app needs to be respected first only for LeafQueue#pendingOrderingPolicy, right? For LeafQueue#orderingPolicy, this is not needed.

        Reference test case TestRMRestart#testRMRestartAppRunningAMFailed

        I don't understand how this test case is related.

        sunilg Sunil G added a comment -

        Sorry, I didn't mean FairScheduler; I was trying to refer to FairOrderingPolicy.

        sunilg Sunil G added a comment -

        Hi Rohith Sharma K S
        This new fix will also introduce RecoveryComparator into FairOrderingPolicy. Is that needed? I think it can be tracked separately after check-in, once we know whether the same problem arises with FairScheduler.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
        +1 mvninstall 7m 29s trunk passed
        +1 compile 0m 26s trunk passed with JDK v1.8.0_66
        +1 compile 0m 30s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 16s trunk passed
        +1 mvnsite 0m 36s trunk passed
        +1 mvneclipse 0m 16s trunk passed
        +1 findbugs 1m 10s trunk passed
        +1 javadoc 0m 21s trunk passed with JDK v1.8.0_66
        +1 javadoc 0m 27s trunk passed with JDK v1.7.0_91
        +1 mvninstall 0m 30s the patch passed
        +1 compile 0m 24s the patch passed with JDK v1.8.0_66
        +1 javac 0m 23s the patch passed
        +1 compile 0m 27s the patch passed with JDK v1.7.0_91
        +1 javac 0m 27s the patch passed
        -1 checkstyle 0m 16s Patch generated 7 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 274, now 275).
        +1 mvnsite 0m 34s the patch passed
        +1 mvneclipse 0m 13s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        -1 findbugs 1m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager introduced 1 new FindBugs issues.
        +1 javadoc 0m 19s the patch passed with JDK v1.8.0_66
        +1 javadoc 0m 25s the patch passed with JDK v1.7.0_91
        -1 unit 63m 50s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66.
        -1 unit 64m 34s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91.
        +1 asflicense 0m 17s Patch does not generate ASF License warnings.
        145m 43s



        Reason Tests
        FindBugs module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.PriorityComparator implements Comparator but not Serializable At PriorityComparator.java:Serializable At PriorityComparator.java:[lines 26-34]
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12779965/0003-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 6acff4cee21e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / ad997fa
        Default Java 1.7.0_91
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/10124/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        findbugs https://builds.apache.org/job/PreCommit-YARN-Build/10124/artifact/patchprocess/new-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.html
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10124/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10124/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10124/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10124/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10124/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Max memory used 75MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10124/console

        This message was automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Attaching the updated patch; kindly review.

        rohithsharma Rohith Sharma K S added a comment -

        I think we can just rely on the isAppRecovering flag, which should be sufficient. The existing code in RMAppAttemptImpl can stay as-is (without the patch); only FAILED attempts are added to the scheduler, and they are removed in the very next event.

        rohithsharma Rohith Sharma K S added a comment -

        For finished attempt, I think we do not need to re-add into scheduler, so this whole code could be removed.

        While recovering an application and its attempts, if the last attempt is FAILED, the scheduler transfers state from the previous attempt. So whenever there is a failed attempt, the attempt has to be added to the scheduler to obtain that state. Reference test case: TestRMRestart#testRMRestartAppRunningAMFailed.

        jianhe Jian He added a comment -
        • For a finished attempt, I think we do not need to re-add it into the scheduler, so this whole code could be removed.
                    if (EnumSet.of(RMAppAttemptState.RUNNING, RMAppAttemptState.LAUNCHED)
                        .contains(appAttempt.recoveredFinalState)) {
                      appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
                          appAttempt.getAppAttemptId(), false, true, true));
                    } else {
                      appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
                          appAttempt.getAppAttemptId(), false, true));
                    }
           

          Accordingly, in BaseFinalTransition, this code needs to be invoked if recoveredFinalState == null

           appAttempt.eventHandler.handle(new AppAttemptRemovedSchedulerEvent(
                  appAttemptId, finalAttemptState, keepContainersAcrossAppAttempts));
           
        • With the above change, we can assume that an attempt added into the scheduler was running, so the extra field wasAttemptRunning in AppAttemptAddedSchedulerEvent is not needed; the existing isAttemptRecovering flag should be enough.
        • I think Naganarasimha G R's suggestion makes sense; we should consider FairComparator too. Maybe we can add a predefined comparator in AbstractComparatorOrderingPolicy with the recoveryComparator initialized and force underlying implementations to use it? (A sketch follows below.)
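        For illustration, a rough sketch of that idea. RecoveryComparator, PriorityComparator and FifoComparator are the comparators named in this thread; the CompoundComparator helper, the constructor shape and the base-class fields used here are assumptions, not the actual patch:

            // Sketch only: compose a recovery comparator ahead of the policy's
            // own comparators so recovered-running apps sort first, then by
            // priority, then FIFO. comparator/schedulableEntities are assumed
            // to be fields of AbstractComparatorOrderingPolicy.
            public class FifoOrderingPolicyWithRecovery<S extends SchedulableEntity>
                extends AbstractComparatorOrderingPolicy<S> {

              public FifoOrderingPolicyWithRecovery() {
                List<Comparator<SchedulableEntity>> comparators =
                    new ArrayList<Comparator<SchedulableEntity>>();
                comparators.add(new RecoveryComparator());  // recovered-running apps first
                comparators.add(new PriorityComparator());  // then application priority
                comparators.add(new FifoComparator());      // then submission order
                this.comparator = new CompoundComparator(comparators);
                this.schedulableEntities = new TreeSet<S>(this.comparator);
              }
            }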
        rohithsharma Rohith Sharma K S added a comment -

        I had two options for doing this in FifoOrderingPolicy. I took the simpler approach to get a working patch. Further improvements like this can be addressed in upcoming patches once the initial approach is agreed upon.

        Naganarasimha Naganarasimha G R added a comment -

        Hi Rohith Sharma K S, thanks for the patch.
        The new approach seems better than the earlier one, as it avoids an additional data structure used for the same purpose, but a few points:

        • FairOrderingPolicy first applies the FairComparator and then the FifoComparator, so only when fairness is equal would it consider whether the application was already running. Would it be better to add an additional recovery comparator that can be used by both Fair and Fifo? (See the sketch after this list.)
        • It will be left entirely to the ordering policy whether to order recovered apps by submission time or not, so it would be better to document that so a custom ordering policy can take it into account.
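        To make that suggestion concrete, a sketch of what such a shared recovery comparator might look like; the isRecovering() accessor on SchedulableEntity is an assumption here:

            // Sketch: apps that were running before the RM restart sort ahead of
            // apps that were merely pending; ties fall through to the next
            // comparator in the chain (fair, priority, FIFO, ...).
            public class RecoveryComparator
                implements Comparator<SchedulableEntity>, Serializable {

              private static final long serialVersionUID = 1L;

              @Override
              public int compare(SchedulableEntity r1, SchedulableEntity r2) {
                boolean r1Recovering = r1.isRecovering();  // assumed accessor
                boolean r2Recovering = r2.isRecovering();  // assumed accessor
                if (r1Recovering == r2Recovering) {
                  return 0;
                }
                return r1Recovering ? -1 : 1;
              }
            }

        Implementing Serializable here would also avoid the kind of FindBugs warning reported above for PriorityComparator.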
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
        +1 mvninstall 8m 5s trunk passed
        +1 compile 0m 33s trunk passed with JDK v1.8.0_66
        +1 compile 0m 36s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 14s trunk passed
        +1 mvnsite 0m 43s trunk passed
        +1 mvneclipse 0m 18s trunk passed
        +1 findbugs 1m 18s trunk passed
        +1 javadoc 0m 26s trunk passed with JDK v1.8.0_66
        +1 javadoc 0m 30s trunk passed with JDK v1.7.0_91
        +1 mvninstall 0m 36s the patch passed
        +1 compile 0m 36s the patch passed with JDK v1.8.0_66
        +1 javac 0m 36s the patch passed
        +1 compile 0m 37s the patch passed with JDK v1.7.0_91
        +1 javac 0m 37s the patch passed
        -1 checkstyle 0m 14s Patch generated 10 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 373, now 379).
        +1 mvnsite 0m 39s the patch passed
        +1 mvneclipse 0m 15s the patch passed
        -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
        -1 whitespace 0m 0s The patch has 3 line(s) with tabs.
        +1 findbugs 1m 27s the patch passed
        +1 javadoc 0m 26s the patch passed with JDK v1.8.0_66
        -1 javadoc 3m 10s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 2, now 3).
        +1 javadoc 0m 35s the patch passed with JDK v1.7.0_91
        -1 unit 65m 54s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66.
        -1 unit 66m 50s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91.
        -1 asflicense 0m 23s Patch generated 1 ASF License warnings.
        152m 28s



        Reason Tests
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12779301/0002-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux d516326ad86b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / bb5df27
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/whitespace-eol.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/whitespace-tabs.txt
        javadoc hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91: https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10084/testReport/
        asflicense https://builds.apache.org/job/PreCommit-YARN-Build/10084/artifact/patchprocess/patch-asflicense-problems.txt
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Max memory used 76MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10084/console

        This message was automatically generated.

        Naganarasimha Naganarasimha G R added a comment -

        Thanks Sunil G,

        Its debatable and I think with discussion we can conclude the approach here.

        True, it's debatable, but one more thing to be considered (and not missed) here: A4 and A5 get activated even before A2 (as per the correction I mentioned).

        All containers which were running earlier will still continue

        I mistook what you meant; it seems like you got what I wanted to convey.

        sunilg Sunil G added a comment -

        Thanks NGarla_Unused for the comments.

        This patch tries to activate all applications which were running before RM restart happened

        That said, yes, it definitely depends on the available AM limit after restart (I meant the positive case in my earlier comment, where all cluster resources were available).
        I did think about the case where some NMs have not registered back and the limit is lower. In that case, app-A1 will be pending in the list waiting to get activated, and it will be the one activated first if any space becomes available. This ensures that the high-priority apps which were in the pending list will get containers, while app-A1, which was lower in priority, will wait. Even once A1 is activated, it has to wait until the other high-priority apps are done with their requests. So A1 sitting in the pending list may be fine, provided the other apps complete sooner or the failed NMs come back up. But I am not saying it is correct. It's debatable, and I think with discussion we can conclude the approach here.

        Also, about "All containers which were running earlier will still continue": I meant the live containers of apps which were running prior to restart. After restart, even for the pending apps (apps like A1 in your scenario), their running containers won't be killed. Am I missing something?

        Naganarasimha Naganarasimha G R added a comment -

        Small correction in the example: A1 = 8GB, A2 = 2GB, A3 = 2GB, A4 = 2GB, A5 = 2GB => A1 = 2GB, A2 = 8GB, A3 = 2GB, A4 = 2GB, A5 = 2GB

        Naganarasimha Naganarasimha G R added a comment -

        Thanks for the comments Sunil G & Rohith Sharma K S,

        This patch tries to activate all applications which were running before RM restart happened.

        IIUC the patch, it goes through the existing flow; hence not all applications will be activated by default. An app will get activated only if the queue's AM resource limit allows it.

        2. All containers which were running earlier will still continue,

        To elaborate further on the scenario I had mentioned: assume the queue capacity is 120GB (for simplicity), the AM resource limit is 10% (= 12GB), and the AM resources are A1 = 8GB, A2 = 2GB, A3 = 2GB, A4 = 2GB, A5 = 2GB. After recovery, assume not all nodes are up and only 100GB is available (so the AM resource limit works out to 10GB). As per the code in the patch, A3, A2, A4 and A5 will get activated (8GB in total) and A1 will not get activated even though the app is running. Correct me if my understanding is wrong.

        Being said all this points, I also feel that we may need to add more complex code to keep the same order as you proposed. So if there are no major impacts, I think the approach taken in this patch looks fine. Thoughts?

        IIUC, point 1 is the same with or without the patch, so no issues. For point 2, IIUC your assumption is wrong: all containers which were running earlier will still continue.
        But the approach to the scenario I mentioned is debatable; if it introduces too much complexity then we can skip it. I just wanted to share the scenario; as I said, the current approach is fine except for the scenario mentioned.

        A few nits/queries on the patch:

        @@ -607,9 +612,24 @@ private synchronized void activateApplications() {
             Map<String, Resource> userAmPartitionLimit =
                 new HashMap<String, Resource>();
         
        -    for (Iterator<FiCaSchedulerApp> i = getPendingAppsOrderingPolicy()
        -        .getAssignmentIterator(); i.hasNext();) {
        -      FiCaSchedulerApp application = i.next();
        +    for (Iterator<FiCaSchedulerApp> i =
        +        getPendingAppsOrderingPolicyRecovery().getAssignmentIterator(); i
        +        .hasNext();) {
        +      activateApplications(i, amPartitionLimit, userAmPartitionLimit);
        +    }
        +
        

        Is the for loop required here, given that we are already looping over the iterator in the overloaded activateApplications(fsApp, amPartitionLimit, userAmPartitionLimit)?
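        For reference, a minimal sketch of the simplification this question seems to point at, in which the outer loops are dropped and the overloaded helper is assumed to drain the iterator itself (the helper's exact signature is an assumption):

            // Sketch: recovered apps are activated first, then normally pending apps.
            private synchronized void activateApplications() {
              Map<String, Resource> amPartitionLimit = new HashMap<String, Resource>();
              Map<String, Resource> userAmPartitionLimit = new HashMap<String, Resource>();

              activateApplications(
                  getPendingAppsOrderingPolicyRecovery().getAssignmentIterator(),
                  amPartitionLimit, userAmPartitionLimit);
              activateApplications(
                  getPendingAppsOrderingPolicy().getAssignmentIterator(),
                  amPartitionLimit, userAmPartitionLimit);
            }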

        sunilg Sunil G added a comment -

        Hi Naganarasimha G R

        IIUC Sunil G also wanted to say the same?

        I meant it in a slightly different way. With the existing approach, any application which was running prior to the RM restart will also be in RUNNING state. It may starve as per the scenario which you and Rohith mentioned, but the state of the application will still be RUNNING.

        Also, adding to the existing discussion, I would like to point out a few things.
        This patch tries to activate all applications which were running before the RM restart happened. They may get activated in a different order, but it tries to put all these apps in the scheduler's activated list (app state will still be RUNNING).
        1. After restart, with or without ordering, only the highest-priority app will be selected for scheduling from the activated list. The same behavior applies before an RM restart as well, so there seems to be no impact here. Please correct me if I am wrong.
        2. All containers which were running earlier will still continue, and all pending requests will be updated/refreshed from the ApplicationMasterService thread. So if all earlier running apps are activated, the behavior will be the same from the scheduler's end, correct?

        Having said all these points, I also feel that we may need to add more complex code to keep the same order as you proposed. So if there are no major impacts, I think the approach taken in this patch looks fine. Thoughts?

        rohithsharma Rohith Sharma K S added a comment -

        Even without an RM restart, apps can currently be starved if the AM limit is reached.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 0s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        +1 mvninstall 7m 58s trunk passed
        +1 compile 0m 30s trunk passed with JDK v1.8.0_66
        +1 compile 0m 33s trunk passed with JDK v1.7.0_91
        +1 checkstyle 0m 14s trunk passed
        +1 mvnsite 0m 39s trunk passed
        +1 mvneclipse 0m 16s trunk passed
        +1 findbugs 1m 16s trunk passed
        +1 javadoc 0m 24s trunk passed with JDK v1.8.0_66
        +1 javadoc 0m 28s trunk passed with JDK v1.7.0_91
        +1 mvninstall 0m 36s the patch passed
        +1 compile 0m 29s the patch passed with JDK v1.8.0_66
        +1 javac 0m 29s the patch passed
        +1 compile 0m 32s the patch passed with JDK v1.7.0_91
        +1 javac 0m 32s the patch passed
        -1 checkstyle 0m 14s Patch generated 10 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 321, now 325).
        +1 mvnsite 0m 40s the patch passed
        +1 mvneclipse 0m 16s the patch passed
        -1 whitespace 0m 0s The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix.
        -1 whitespace 0m 0s The patch has 3 line(s) with tabs.
        -1 findbugs 1m 29s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager introduced 2 new FindBugs issues.
        +1 javadoc 0m 23s the patch passed with JDK v1.8.0_66
        +1 javadoc 0m 30s the patch passed with JDK v1.7.0_91
        -1 unit 65m 34s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66.
        -1 unit 66m 29s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91.
        +1 asflicense 0m 23s Patch does not generate ASF License warnings.
        150m 57s



        Reason Tests
        FindBugs module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.pendingOrderingPolicy; locked 87% of time Unsynchronized access at LeafQueue.java:87% of time Unsynchronized access at LeafQueue.java:[line 1534]
          Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.pendingOrderingPolicyRecovery; locked 87% of time Unsynchronized access at LeafQueue.java:87% of time Unsynchronized access at LeafQueue.java:[line 1536]
        JDK v1.8.0_66 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
        JDK v1.7.0_91 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions
          hadoop.yarn.server.resourcemanager.TestAMAuthorization



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12778805/0001-YARN-4479.patch
        JIRA Issue YARN-4479
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux db24c80ec6a7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 52ad912
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/whitespace-eol.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/whitespace-tabs.txt
        findbugs https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/new-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.html
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/10055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91.txt
        JDK v1.7.0_91 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10055/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Max memory used 75MB
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10055/console

        This message was automatically generated.

        Naganarasimha Naganarasimha G R added a comment -

        In most cases this would be sufficient, but consider a case where A3 is an app with a large number of containers and A1 and A2 are short jobs. If, after recovery, all nodes have not yet registered, then activating A1 and/or A2 would exceed the AM resource limit, so only A3 will be running. The problem here is that there is no separately configurable ordering policy for recovered apps; the same policy applies, so in rare cases it might lead to starvation.

        rohithsharma Rohith Sharma K S added a comment -

        but wanted to know whether the order of activation should be A1, A2, A3 itself and not based on the ordering policy for the recovered apps, Thoughts ?

        It should be based on the ordering policy only. At this stage, all three applications A1, A2 and A3 are at the same level, so activation should be based on the ordering policy implementation. Specifically for priority, the highest-priority application should always be activated first.

        Naganarasimha Naganarasimha G R added a comment -

        Hi Rohith Sharma K S,
        As discussed offline,
        Assume that before recovery the activated apps were A1 (Low), A2 (Low), A3 (Medium), and the pending apps were A4 (High) and A5 (High).
        Based on the current approach, applications will be activated in the order A4, A5, A3, A1, A2.
        After your patch it will be A3, A1, A2, A4, A5.
        So in a way it is better than the existing approach, but I wanted to know whether the order of activation should be A1, A2, A3 itself, rather than based on the ordering policy, for the recovered apps. Thoughts?
        IIUC Sunil G also wanted to say the same?

        sunilg Sunil G added a comment -

        Adding to this,
        An application that was in RUNNING state prior to restart will still be in RUNNING state. However, it will NOT get any new containers until it is made active again, once the other higher-priority applications have completed.

        The patch generally looks fine; I will give some comments after checking the patch in more detail.

        rohithsharma Rohith Sharma K S added a comment -

        The patch does the following (a minimal sketch of the resulting activation flow is shown after this list):

        1. During attempt recovery, a new flag is added that tells the scheduler whether the attempt was RUNNING/LAUNCHED before the RM restart.
        2. During recovery, previously running attempts are added to a separate pending-application ordering policy that is used to track applications which were running before the RM restart.
        3. When a node registers, the applications that were running before the RM restart are activated first.
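
        To make the flow concrete, here is a minimal, self-contained sketch of activating recovered applications ahead of newly pending ones. The class and member names (App, pendingOrderingPolicyRecovery, pendingOrderingPolicy, amLimitAllows) are hypothetical stand-ins for illustration only, not the actual CapacityScheduler/LeafQueue code:

{code:java}
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

public class RecoveryFirstActivationSketch {
    // Hypothetical stand-in for a scheduler application entity.
    static class App {
        final String id;
        final int priority;
        App(String id, int priority) { this.id = id; this.priority = priority; }
    }

    // Higher priority first; ties broken by id so the TreeSet stays consistent.
    private static final Comparator<App> PRIORITY_ORDER = new Comparator<App>() {
        @Override
        public int compare(App a, App b) {
            int byPriority = Integer.compare(b.priority, a.priority);
            return byPriority != 0 ? byPriority : a.id.compareTo(b.id);
        }
    };

    // Attempts that were RUNNING/LAUNCHED before the RM restart (step 2 above).
    private final TreeSet<App> pendingOrderingPolicyRecovery = new TreeSet<App>(PRIORITY_ORDER);
    // Newly submitted pending applications, ordered by the queue's configured policy.
    private final TreeSet<App> pendingOrderingPolicy = new TreeSet<App>(PRIORITY_ORDER);

    /** Called, for example, when a node registers and more AM headroom becomes available. */
    void activateApplications() {
        // Step 3 above: recovered applications are activated first,
        // restoring the pre-restart state before any new app is admitted.
        activateFrom(pendingOrderingPolicyRecovery);
        activateFrom(pendingOrderingPolicy);
    }

    private void activateFrom(TreeSet<App> pending) {
        for (Iterator<App> it = pending.iterator(); it.hasNext();) {
            App app = it.next();
            if (!amLimitAllows(app)) {
                break; // stop once the AM resource limit would be exceeded
            }
            System.out.println("Activating " + app.id);
            it.remove();
        }
    }

    // Placeholder for the real AM resource limit check.
    private boolean amLimitAllows(App app) {
        return true;
    }
}
{code}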
        rohithsharma Rohith Sharma K S added a comment -

        Attaching the patch, kindly review.

        rohithsharma Rohith Sharma K S added a comment -

        The scenario that causes the issue is (a small sketch of the problematic ordering follows the list):

        1. Submit app-1 and app-2 with priority 5. Both applications are activated and in RUNNING state.
        2. Submit app-3 with priority 6. This application stays in pending state because of the AM limit.
        3. The RM is restarted. app-1 is activated (by design, the AM limit is not considered for the first application), while app-2 and app-3 are placed in the pending ordering policy.
        4. The AMs re-register for app-1 and app-2, so their state is now RUNNING. But app-2 and app-3 are still among the pending applications.
        5. A NodeManager re-registers with the RM, so one more application can be activated. Here app-3 always gets activated since it has the higher priority, but app-2 should be activated first since it was running before the RM restart.
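
        The ordering problem in step 5 can be reproduced with a small self-contained comparison. The App class and the priority-only comparator below are purely illustrative and are not the queue's real ordering-policy classes:

{code:java}
import java.util.Comparator;
import java.util.TreeSet;

public class PriorityOnlyOrderingDemo {
    static class App {
        final String id;
        final int priority;
        App(String id, int priority) { this.id = id; this.priority = priority; }
    }

    public static void main(String[] args) {
        // Priority-only ordering, i.e. what the pending list uses without the fix.
        TreeSet<App> pending = new TreeSet<App>(new Comparator<App>() {
            @Override
            public int compare(App a, App b) {
                int byPriority = Integer.compare(b.priority, a.priority); // higher first
                return byPriority != 0 ? byPriority : a.id.compareTo(b.id);
            }
        });
        pending.add(new App("app-2", 5)); // was RUNNING before the RM restart
        pending.add(new App("app-3", 6)); // was never activated before the restart
        // Prints "app-3": the higher-priority app wins the only slot,
        // so the recovered app-2 keeps starving after recovery.
        System.out.println("Next app to activate: " + pending.first().id);
    }
}
{code}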
        sunilg Sunil G added a comment -

        Thanks Rohith Sharma K S for raising this. As discussed offline, all attempts that were running at recovery time (attempt state NOT a final state and NOT null) could be considered for this list. Such apps can be activated first just to stay in sync with the state before recovery. Starvation is still possible, but it will be the same as what was happening before HA.

        rohithsharma Rohith Sharma K S added a comment -

        While recovering the applications, I am thinking that previously running applications should be added to a separate ordering policy, something like pendingOrderingPolicyRecovery, and that this policy should be iterated for activation before the pendingOrderingPolicy iteration.
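
        A related variation, sketched below, is to keep a single pending collection but use a recovery-aware comparator that always orders previously running applications ahead of the rest and only then falls back to the queue's ordering. This is only an illustrative sketch of the idea; the class names and the wasRunningBeforeRestart flag are assumptions, not the actual ordering-policy implementation:

{code:java}
import java.util.Comparator;

public class RecoveryAwareComparatorSketch {
    static class App {
        final String id;
        final int priority;
        final boolean wasRunningBeforeRestart; // hypothetical recovery flag
        App(String id, int priority, boolean wasRunningBeforeRestart) {
            this.id = id;
            this.priority = priority;
            this.wasRunningBeforeRestart = wasRunningBeforeRestart;
        }
    }

    /** Recovered (previously running) apps first, then higher priority, then FIFO by id. */
    static final Comparator<App> RECOVERY_FIRST = new Comparator<App>() {
        @Override
        public int compare(App a, App b) {
            if (a.wasRunningBeforeRestart != b.wasRunningBeforeRestart) {
                return a.wasRunningBeforeRestart ? -1 : 1;
            }
            int byPriority = Integer.compare(b.priority, a.priority);
            return byPriority != 0 ? byPriority : a.id.compareTo(b.id);
        }
    };

    public static void main(String[] args) {
        App app2 = new App("app-2", 5, true);   // running before the restart
        App app3 = new App("app-3", 6, false);  // pending before the restart
        // Negative result: app-2 sorts ahead of app-3 despite its lower priority.
        System.out.println(RECOVERY_FIRST.compare(app2, app3));
    }
}
{code}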


          People

          • Assignee:
            rohithsharma Rohith Sharma K S
            Reporter:
            rohithsharma Rohith Sharma K S
          • Votes:
            0
            Watchers:
            10
