Application recovery continuously fails with "Application with id already present. Cannot duplicate"

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      YARN-2588 handles an exception thrown during transitionToActive by resetting the active services, but it does not clear the apps/nodes details held in RMContext, nor ClusterMetrics and QueueMetrics. This causes application recovery to fail.

      1. YARN-2865.1.patch
        38 kB
        Rohith Sharma K S
      2. YARN-2865.patch
        39 kB
        Rohith Sharma K S
      3. YARN-2865.patch
        5 kB
        Rohith Sharma K S

        Activity

        rohithsharma Rohith Sharma K S added a comment -

        I ran into this scenario in my test cluster in an unusual way; it caused the exception below.

        2014-11-14 04:11:33,433 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Recovering 2 applications
        2014-11-14 04:11:33,433 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority level is set to application:application_1415591025732_0001
        2014-11-14 04:11:33,433 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Application with id application_1415591025732_0001 is already present! Cannot add a duplicate!
        2014-11-14 04:11:33,433 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to load/recover state
        org.apache.hadoop.yarn.exceptions.YarnException: Application with id application_1415591025732_0001 is already present! Cannot add a duplicate!
                        at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:45)
                        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:364)
                        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:332)
                        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
                        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1146)
                        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:521)
                        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
                        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:925)
                        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:966)
                        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:962)
                        at java.security.AccessController.doPrivileged(Native Method)
                        at javax.security.auth.Subject.doAs(Subject.java:415)
                        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1612)
                        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:962)
                        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:281)
                        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
                        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:805)
                        at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:416)
                        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:602)
                        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
        
        rohithsharma Rohith Sharma K S added a comment -

        Attaching a patch that clears the rmcontext, cluster metrics and queue metrics. I have also refactored the common methods that were called from both transitionToActive and transitionToStandBy.
        Please review the patch.

        jianhe Jian He added a comment -

        Hi Rohith, one question: why does the rmContext still contain the application? If the RM was in standby mode, transitionToStandby should have cleaned up the rmContext.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12681615/YARN-2865.patch
        against trunk revision 49c3889.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5846//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5846//console

        This message is automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        why does the rmContext still contain the application? If the RM were at standby mode, the transitionToStandby should have cleaned the rmContext up ?

        I agree for the normal flow. But what if transitionToActive throws an exception after recovery has succeeded? The recovery process adds applications back to RMContext in RMAppManager. If any service fails to start after recovery has completed, RMContext is left holding stale applications.
        Consider the following sequence:

        1. The RM is in standby. The initial state is STANDBY.
        2. STANDBY to ACTIVE:
          1. Recovery: all applications are recovered successfully, and RMContext now contains them.
          2. An active service fails to start and throws an exception back.
            The RM state remains STANDBY. The exception handling here resets only the dispatcher, not the rmcontext/metrics system. Currently, that cleanup is done in stopActiveServices().
        3. STANDBY to ACTIVE: recovery fails with the exception above. The RM never moves to ACTIVE on further transitionToActive commands from the elector unless it first receives a STANDBY to STANDBY command and then a STANDBY to ACTIVE command.
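
The sequence above can be reproduced with a small self-contained model. This is only a sketch under stated assumptions: `ToyContext` and `ToyResourceManager` are hypothetical stand-ins invented for illustration and are not the real RMContext or ResourceManager classes. The `freshContextPerAttempt` flag models the idea of starting every standby-to-active attempt from an empty context.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the shared context; not the real RMContext. */
class ToyContext {
    final Map<String, String> apps = new HashMap<>();
}

/** Hypothetical stand-in for the standby-to-active path of the RM. */
class ToyResourceManager {
    private ToyContext context = new ToyContext();
    private final boolean freshContextPerAttempt;
    private boolean active = false;

    ToyResourceManager(boolean freshContextPerAttempt) {
        this.freshContextPerAttempt = freshContextPerAttempt;
    }

    /** Re-adds stored applications and rejects duplicates, as in the log above. */
    private void recover(String... storedAppIds) {
        for (String id : storedAppIds) {
            if (context.apps.putIfAbsent(id, "RECOVERED") != null) {
                throw new IllegalStateException("Application with id " + id
                        + " is already present! Cannot add a duplicate!");
            }
        }
    }

    /**
     * Attempts STANDBY to ACTIVE. Returns true only if the RM became active.
     * laterServiceFails simulates an active service that fails after recovery.
     */
    boolean transitionToActive(boolean laterServiceFails, String... storedAppIds) {
        if (freshContextPerAttempt) {
            context = new ToyContext(); // discard anything left by a failed attempt
        }
        try {
            recover(storedAppIds);
            if (laterServiceFails) {
                throw new IllegalStateException("an active service failed to start");
            }
            active = true;
        } catch (IllegalStateException e) {
            active = false; // stays in standby; the context is NOT cleaned here
        }
        return active;
    }
}
```

With `freshContextPerAttempt` set to false, a first attempt that fails after recovery leaves the two applications in the context, so every later attempt fails in `recover`. With it set to true, the second attempt starts from an empty context and succeeds.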
        jianhe Jian He added a comment -

        Makes sense, thanks for the explanation! I will review the patch.

        jianhe Jian He added a comment -

        The patch looks good overall.
        It should be enough to fix this JIRA. In addition, we need to clear systemCredentials, schedulerRecoveryStartTime, schedulerRecoveryWaitTime, etc. too. To future-proof this, we should probably have a separate class that holds all the active services' context and instantiate a new one when transitioning from standby to active, instead of resetting each individual field. Thoughts? We can definitely do this separately.

        kasha Karthik Kambatla added a comment -

        probably we should have a separate class including all the active services' context and instantiate a new class when transitioning from standby to active

        +1

        rohithsharma Rohith Sharma K S added a comment -

        In addition, we need to clear "systemCredentials, schedulerRecoveryStartTime, schedulerRecoveryWaitTime etc.. " too

        Agree

        probably we should have a separate class including all the active services' context and instantiate a new class when transitioning from standby to active

        +1. Is this YARN-1874?

        rohithsharma Rohith Sharma K S added a comment -

        Is it YARN-1874 ?

        That is a different issue, which moves RMActiveServices out of ResourceManager; please ignore it.

        rohithsharma Rohith Sharma K S added a comment -

        Attached a patch that separates RMContext and RMActiveServiceContext with minimal changes.

        1. Retain the RMContext interface. Add RMActiveServiceContext, which owns the active services' details. In RMContextImpl, these are accessed via activeServiceContext.getXXXXX() and activeServiceContext.setXXXXX().
          • RMContext: rmDispatcher, isHAEnabled, haServiceState, adminService, configurationProvider, activeServiceContext
          • RMActiveServiceContext: all the other fields from RMContext (such as stateStore).
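
The split described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `ContextSketch` and `ActiveServiceContextSketch` are hypothetical names, the fields are plain strings and maps rather than real services, and only two delegated fields are shown.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical holder for state that only matters while the RM is active. */
class ActiveServiceContextSketch {
    private final ConcurrentMap<String, String> apps = new ConcurrentHashMap<>();
    private String stateStore; // placeholder type; the real field is a service

    ConcurrentMap<String, String> getApps() { return apps; }
    String getStateStore() { return stateStore; }
    void setStateStore(String stateStore) { this.stateStore = stateStore; }
}

/** Hypothetical outer context: keeps HA-level fields and delegates the rest. */
class ContextSketch {
    private final boolean haEnabled; // survives every transition
    private ActiveServiceContextSketch activeServiceContext =
            new ActiveServiceContextSketch();

    ContextSketch(boolean haEnabled) { this.haEnabled = haEnabled; }

    boolean isHAEnabled() { return haEnabled; }

    // Callers keep using the outer context; the calls are forwarded.
    ConcurrentMap<String, String> getApps() { return activeServiceContext.getApps(); }
    String getStateStore() { return activeServiceContext.getStateStore(); }
    void setStateStore(String s) { activeServiceContext.setStateStore(s); }

    /** Swapping in a fresh instance drops all active-only state in one step. */
    void setActiveServiceContext(ActiveServiceContextSketch fresh) {
        this.activeServiceContext = fresh;
    }
}
```

Because every active-only field lives behind one reference, replacing that reference clears applications, the state store and any field added later, without having to reset each one individually.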
        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12682163/YARN-2865.patch
        against trunk revision 9dd5d67.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5866//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5866//console

        This message is automatically generated.

        kasha Karthik Kambatla added a comment -

        Patch looks mostly good. Minor comments - we should reduce the visibility of the class and its methods to package-private, mark it @Private @Unstable, and add comments that this class is expected to be used only by RMContext and ResourceManager. I just want to guard against new code using this instead of RMContext; we might want this to be accessible in the future, but we should probably keep the changes small in this JIRA.

        jianhe Jian He added a comment -

        Looks good to me too. One minor thing:
        In RMActiveServices, some calls use rmContext#setter and some use activeServiceContext#setter; we could make this consistent by using the latter.

        rohithsharma Rohith Sharma K S added a comment -

        Thanks Karthik and Jian He for the review. I will update the patch.

        In RMActiveServices, some are using rmContext#setter, some are using activeServiceContext#setter, we may make it consistent to use the latter

        RMContext has 5 setter methods. I used those methods to set values from RMActiveServices just to retain the interface implementation.

        ozawa Tsuyoshi Ozawa added a comment -

        Rohith Sharma K S, thanks for taking this issue. I'd like to +1 adding the Private and Unstable annotations to the methods defined in RMActiveServiceContext, as Karthik mentioned.

        Otherwise it looks good to me.

        rohithsharma Rohith Sharma K S added a comment -

        Attached an updated patch. The changes from the previous patch are:
        1. Addressed Karthik's comment: added a comment for RMActiveServiceContext and marked it with the @Private and @Unstable annotations.
        2. Addressed Jian He's comment: I now use only rmcontext to set services.

        Please review the patch.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12682428/YARN-2865.1.patch
        against trunk revision 5bd048e.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5878//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5878//console

        This message is automatically generated.

        jianhe Jian He added a comment -

        LGTM. The test failure looks unrelated; re-kicking Jenkins.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12682428/YARN-2865.1.patch
        against trunk revision 73348a4.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5883//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5883//console

        This message is automatically generated.

        jianhe Jian He added a comment -

        Committed to trunk and branch-2. Thanks Rohith!

        Thanks Karthik and Tsuyoshi for the review and comments!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #6577 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6577/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/CHANGES.txt
        rohithsharma Rohith Sharma K S added a comment -

        Thanks Karthik, Jian He and Tsuyoshi for your reviews.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #11 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/11/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #749 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/749/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #11 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/11/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Hdfs-trunk #1939 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1939/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #1963 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1963/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #11 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/11/)
        YARN-2865. Fixed RM to always create a new RMContext when transtions from StandBy to Active. Contributed by Rohith Sharmaks (jianhe: rev 9cb8b75ba57f18639492bfa3b7e7c11c00bb3d3b)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
        vinodkv Vinod Kumar Vavilapalli added a comment -

        Pulled this into 2.6.1. Ran compilation and TestRMHA before the push. Patch applied cleanly.

        s.a.rao@accenture.com s.a.rao@accenture.com added a comment -

        Hi,

        We are using CDH 5.3.0, and YARN is not becoming active because of the exception below.

        org.apache.hadoop.yarn.exceptions.YarnException: Application with id application_1470357060724_43131 is already present! Cannot add a duplicate!
        at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:45)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:365)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:310)
        at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:427)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1126)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:501)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at

        Could you please help us resolve this issue?

        Thanks,
        Sudhakar Rao
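
        For readers following the trace above, the failure mode this issue describes can be sketched roughly as follows. This is a minimal illustration, not the actual ResourceManager code: ActiveContext, recover, and retrySucceeds are hypothetical names. It assumes that a first activation attempt registers the stored applications and then fails, and that the retry either reuses the same context or starts from a new one.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class StaleContextSketch {

    /** Hypothetical stand-in for the application state held by the RM context. */
    static final class ActiveContext {
        final ConcurrentMap<String, String> apps = new ConcurrentHashMap<>();
    }

    /** Hypothetical recovery step: registers each stored app and rejects duplicates. */
    static void recover(ActiveContext ctx, String... storedAppIds) {
        for (String id : storedAppIds) {
            if (ctx.apps.putIfAbsent(id, "RECOVERED") != null) {
                throw new IllegalStateException(
                    "Application with id " + id + " is already present! Cannot add a duplicate!");
            }
        }
    }

    /** Returns true if a second recovery attempt succeeds after a first one left state behind. */
    static boolean retrySucceeds(boolean freshContextOnRetry) {
        String[] stored = {"application_1_0001", "application_1_0002"};
        ActiveContext ctx = new ActiveContext();
        // First attempt populates the context; assume the transition fails afterwards.
        recover(ctx, stored);
        if (freshContextOnRetry) {
            // Discard whatever the failed attempt left behind.
            ctx = new ActiveContext();
        }
        try {
            recover(ctx, stored);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("reuse context: retry succeeds = " + retrySucceeds(false));
        System.out.println("fresh context: retry succeeds = " + retrySucceeds(true));
    }
}
```

        In this sketch the retry fails only when the old context is reused, which is the situation the committed change ("always create a new RMContext" on the transition from standby to active) is meant to avoid.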


          People

          • Assignee:
            rohithsharma Rohith Sharma K S
            Reporter:
            rohithsharma Rohith Sharma K S
          • Votes:
            0
            Watchers:
            11
