
YARN-3194: RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.7.0, 2.6.2, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels: None
    • Environment: NM restart is enabled
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      On NM restart, the NM sends all the outstanding NMContainerStatuses to the RM during registration. The RM can treat the registration as a new node or as a reconnecting node, and triggers the corresponding event (node added or node reconnected).

      1. Node added event: again, two scenarios can occur here
        1. A new node is registering with a different ip:port – NOT A PROBLEM
        2. An old node is re-registering because of the RESYNC command from the RM after an RM restart – NOT A PROBLEM
      2. Node reconnected event:
        1. An existing node is re-registering, i.e. the RM treats it as a reconnecting node when the RM has not restarted
          1. NM RESTART NOT enabled – NOT A PROBLEM
          2. NM RESTART is enabled
            1. Some applications are running on this node – Problem is here
            2. Zero applications are running on this node – NOT A PROBLEM

      Since the NMContainerStatuses are not handled, the RM never gets to know about the completed containers and never releases the resources held by them. The RM will not allocate new containers for the pending resource requests until the completedContainer event is triggered. This causes applications to wait indefinitely, because their pending container requests are never served by the RM.
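      A minimal sketch of the expected handling in the reconnect path, mirroring the normal heartbeat path (simplified; createContainerStatus is an assumed conversion helper and the surrounding transition context is implied):

      // Sketch only: split the registration-time NMContainerStatus list into
      // newly launched and completed containers, exactly as a heartbeat status
      // update would, so the scheduler finally sees the completions.
      List<ContainerStatus> newlyLaunchedContainers = new ArrayList<ContainerStatus>();
      List<ContainerStatus> completedContainers = new ArrayList<ContainerStatus>();
      for (NMContainerStatus remoteContainer : reconnectEvent.getNMContainerStatuses()) {
        ContainerStatus cStatus = createContainerStatus(remoteContainer); // assumed helper
        if (remoteContainer.getContainerState() == ContainerState.RUNNING) {
          newlyLaunchedContainers.add(cStatus);
        } else {
          completedContainers.add(cStatus); // lets the RM release the held resources
        }
      }
      if (!newlyLaunchedContainers.isEmpty() || !completedContainers.isEmpty()) {
        rmNode.nodeUpdateQueue.add(new UpdatedContainerInfo(
            newlyLaunchedContainers, completedContainers));
      }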

      Attachments

      1. 0001-YARN-3194.patch
        18 kB
        Rohith Sharma K S
      2. 0001-yarn-3194-v1.patch
        13 kB
        Rohith Sharma K S

        Activity

        jlowe Jason Lowe added a comment -

        I committed this to branch-2.6 as well.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2062 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2062/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #112 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/112/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #102 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/102/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2043 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2043/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #845 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/845/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #111 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/111/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7162 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7162/)
        YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
        jlowe Jason Lowe added a comment -

        Thanks to Rohith for the contribution and to Jian and Junping for additional review! I committed this to trunk and branch-2.

        djp Junping Du added a comment -

        lgtm three.

        jianhe Jian He added a comment -

        lgtm too

        jlowe Jason Lowe added a comment -

        +1 lgtm. Will commit this tomorrow if there are no further comments.

        rohithsharma Rohith Sharma K S added a comment -

        Findbugs warnings are unrelated to this JIRA. These warnings will be handled as part of YARN-3204.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12699499/0001-YARN-3194.patch
        against trunk revision 2ecea5a.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6660//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6660//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6660//console

        This message is automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Attached the patch addressing all the above comments. Kindly review the new patch.

        rohithsharma Rohith Sharma K S added a comment -

        Thanks Jason Lowe, Junping Du and Jian He for the detailed review.

        the container status processing code is almost a duplicate of the same code in StatusUpdateWhenHealthyTransition

        Agree, this has to be refactored. The majority of the containerStatus processing code is the same.

        we don't remove containers that have completed from the launchedContainers map which seems wrong

        I see, yes. Completed containers should be removed from launchedContainers.

        I don't see why we would process container status sent during a reconnect differently than a regular status update from the NM

        IIUC it is only to deal with NMContainerStatus and ContainerStatus, but I am not sure why these were created as different types. What I see is that ContainerStatus is a subset of NMContainerStatus; I think ContainerStatus could have been embedded inside NMContainerStatus.

        Is below condition valid for the newly added code in ReconnectNodeTransition too ?

        Yes, it is applicable since we are keeping old RMNode object.

        Add timeout to the test, testAppCleanupWhenNMRstarts -> testProcessingContainerStatusesOnNMRestart ? and add more detailed comments about what the test is doing too ?

        Agree.

        Could you add a validation that ApplicationMasterService#allocate indeed receives the completed container in this scenario?

        Agree, I will add it.

        Question: does the 3072 include 1024 for the AM container and 2048 for the allocated container ?

        AM memory is 1024 and the additionally requested container memory is 2048. In the test, the number of requested containers is 1, so allocatedMB should be AM + requested, i.e. 1024 + 2048 = 3072.
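
        A possible shape for the shared routine discussed above (an illustrative sketch only; the method name and signature are not from the patch):

          private static void handleContainerStatuses(RMNodeImpl rmNode,
              List<ContainerStatus> containerStatuses) {
            List<ContainerStatus> newlyLaunchedContainers =
                new ArrayList<ContainerStatus>();
            List<ContainerStatus> completedContainers =
                new ArrayList<ContainerStatus>();
            for (ContainerStatus status : containerStatuses) {
              ContainerId containerId = status.getContainerId();
              if (status.getState() == ContainerState.RUNNING) {
                if (!rmNode.launchedContainers.contains(containerId)) {
                  // Just launched container. RM knows about it the first time.
                  rmNode.launchedContainers.add(containerId);
                  newlyLaunchedContainers.add(status);
                }
              } else {
                // Also drop completed containers from launchedContainers,
                // addressing the review comment above.
                rmNode.launchedContainers.remove(containerId);
                completedContainers.add(status);
              }
            }
            if (!newlyLaunchedContainers.isEmpty() || !completedContainers.isEmpty()) {
              rmNode.nodeUpdateQueue.add(new UpdatedContainerInfo(
                  newlyLaunchedContainers, completedContainers));
            }
          }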

        djp Junping Du added a comment -

        Should be a blocker for 2.7, as it blocks the rolling-upgrade feature, which works in 2.6.

        djp Junping Du added a comment -

        Updated the affects-version to 2.7. Maybe a blocker?

        djp Junping Du added a comment -

        I didn't see this problem originally, but I suspect it was because there were two things that masked it. As mentioned above, this problem doesn't manifest before YARN-2997. In addition, I was testing it with MapReduce applications, and the MR AM will explicitly kill containers for tasks that have completed (as reported by the umbilical connection between the AM and tasks).

        I see. I think that's why we didn't notice this issue before. However, this bug can happen after YARN-2997, so we should mark the affected version as 2.7.

        I don't see why we would process container status sent during a reconnect differently than a regular status update from the NM.

        I think we can do some refactoring work here. However, I think two things could differ between a reconnect and a regular status update: 1. the port number could change (an ephemeral port is used when NM work-preserving restart is disabled); 2. the resource could be updated (assuming the NM's resource could have been updated beforehand). Isn't it?

        jianhe Jian He added a comment -

        I didn't see Jason's comments; I agree with them too.

        jianhe Jian He added a comment -

        Rohith Sharma K S, thanks for your explanation. Could you edit the description to make the problem clearer?

        • Is it possible to have a common method for below code in ReconnectNodeTransition and StatusUpdateWhenHealthyTransition ?
                  // Filter the map to only obtain just launched containers and finished
                  // containers.
                  List<ContainerStatus> newlyLaunchedContainers =
                      new ArrayList<ContainerStatus>();
                  List<ContainerStatus> completedContainers =
                      new ArrayList<ContainerStatus>();
                  for (NMContainerStatus remoteContainer : reconnectEvent
                      .getNMContainerStatuses()) {
                    ContainerId containerId = remoteContainer.getContainerId();
          
                    // Process running containers
                    if (remoteContainer.getContainerState() == ContainerState.RUNNING) {
                      if (!rmNode.launchedContainers.contains(containerId)) {
                        // Just launched container. RM knows about it the first time.
                        rmNode.launchedContainers.add(containerId);
                        ContainerStatus cStatus = createContainerStatus(remoteContainer);
                        newlyLaunchedContainers.add(cStatus);
                      }
                    } else {
          
                      ContainerStatus cStatus = createContainerStatus(remoteContainer);
                      completedContainers.add(cStatus);
                    }
                  }
                  if (newlyLaunchedContainers.size() != 0
                      || completedContainers.size() != 0) {
                    rmNode.nodeUpdateQueue.add(new UpdatedContainerInfo(
                        newlyLaunchedContainers, completedContainers));
                  }
          
        • Is below condition valid for the newly added code in ReconnectNodeTransition too ?
          // Don't bother with containers already scheduled for cleanup, or for
          // applications already killed. The scheduler doesn't need to know any
          // more about this container
          if (rmNode.containersToClean.contains(containerId)) {
            LOG.info("Container " + containerId + " already scheduled for " +
                "cleanup, no further processing");
            continue;
          }
          if (rmNode.finishedApplications.contains(containerId
              .getApplicationAttemptId().getApplicationId())) {
            LOG.info("Container " + containerId
                + " belongs to an application that is already killed,"
                + " no further processing");
            continue;
          }
          
        • Add a timeout to the test, rename testAppCleanupWhenNMRstarts -> testProcessingContainerStatusesOnNMRestart, and add more detailed comments about what the test is doing too?
          @Test
            public void testAppCleanupWhenNMRstarts() throws Exception
          
        • Question: does the 3072 include 1024 for the AM container and 2048 for the allocated container ?
           Assert.assertEquals(3072, allocatedMB);
          
        • Could you add a validation that ApplicationMasterService#allocate indeed receives the completed container in this scenario?
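
        For illustration, such a validation might look like this (a sketch assuming the test's MockAM handle am; not from the patch):

          // Hypothetical check: after the NM re-registers with a COMPLETED
          // NMContainerStatus, the AM's next allocate call should see it.
          AllocateResponse response = am.allocate(
              new ArrayList<ResourceRequest>(), new ArrayList<ContainerId>());
          Assert.assertEquals(1, response.getCompletedContainersStatuses().size());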
        jlowe Jason Lowe added a comment -

        Jason Lowe, I remember we discussed this case in some JIRA under YARN-1336, did you see this problem before?

        I didn't see this problem originally, but I suspect it was because there were two things that masked it. As mentioned above, this problem doesn't manifest before YARN-2997. In addition, I was testing it with MapReduce applications, and the MR AM will explicitly kill containers for tasks that have completed (as reported by the umbilical connection between the AM and tasks).

        I agree that we should be processing the container report sent with the NM registration, and it appears that is being dropped in the reconnected event.

        Comments on the patch:

        I noticed that the container status processing code is almost a duplicate of the same code in StatusUpdateWhenHealthyTransition. One difference is that we don't remove containers that have completed from the launchedContainers map which seems wrong. I don't see why we would process container status sent during a reconnect differently than a regular status update from the NM. Therefore I think we should refactor the code to reuse this logic, as it should apply here just as it does for StatusUpdateWhenHealthyTransition.

        rohithsharma Rohith Sharma K S added a comment -

        Junping Du, from the NM logs I see the same behaviour you explained after NM restart.

        djp Junping Du added a comment -

        I think the NM, after restarting, will try to relaunch these running containers as RecoveredContainers, and if it cannot locate the pid (assuming the container completed during NM downtime), it would report and trigger the completion of those containers. Am I missing anything here?
        Jason Lowe, I remember we discussed this case in some JIRA under YARN-1336; did you see this problem before?

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12699148/0001-yarn-3194-v1.patch
        against trunk revision 814afa4.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6647//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6647//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6647//console

        This message is automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Attached the version-1 patch.
        The patch does the following:

        1. Added ReconnectedEvent handling to process the NMContainerStatuses if applications are running on the node

        Kindly review the patch

        devaraj.k Devaraj K added a comment -

        Thanks Rohith Sharma K S for reporting and Jian He for your inputs.

        I am also able to reproduce this issue. I see that the RM does not get information about the completed containers after NM restart as part of the node heartbeat request, and the AM also does not tell the RM to release these completed containers. As a result, the RM assumes these containers are still running and does not release their resources.

        rohithsharma Rohith Sharma K S added a comment -

        it's removing the old node and adding the newly connected node. RM is also not restarted.

        RMNodeImpl.ReconnectNodeTransition#transition does not remove the old node if any applications are running. In the code below, if noRunningApps is false the node is not removed; instead only the running applications are handled.

        public void transition(RMNodeImpl rmNode, RMNodeEvent event) {
          RMNodeReconnectEvent reconnectEvent = (RMNodeReconnectEvent) event;
          RMNode newNode = reconnectEvent.getReconnectedNode();
          rmNode.nodeManagerVersion = newNode.getNodeManagerVersion();
          List<ApplicationId> runningApps = reconnectEvent.getRunningApplications();
          boolean noRunningApps =
              (runningApps == null) || (runningApps.size() == 0);

          // No application running on the node, so send node-removal event with
          // cleaning up old container info.
          if (noRunningApps) {
            // Remove the node from the scheduler
            // Add the node to the scheduler
          } else {
            rmNode.httpPort = newNode.getHttpPort();
            rmNode.httpAddress = newNode.getHttpAddress();
            rmNode.totalCapability = newNode.getTotalCapability();

            // Reset heartbeat ID since node just restarted.
            rmNode.getLastNodeHeartBeatResponse().setResponseId(0);
          }

          // Handle apps running on this node
          // resource update to scheduler code
        }
        
        jianhe Jian He added a comment -

        I have one doubt on this method ResourceTrackerService#handleNMContainerStatus

        This is legacy code for non-work-preserving restart; we could remove that. Just disregard this method.

        NM RESTART is Enabled – Problem is here

        For the node_reconnect event, it removes the old node and adds the newly connected node. The RM is also not restarted. I don't think we need to handle the RMNodeReconnectEvent.

        rohithsharma Rohith Sharma K S added a comment -

        Not related specifically to this JIRA: I have one doubt about the method ResourceTrackerService#handleNMContainerStatus, which sends the container_finished event only for the master container. Why are the other containers not considered? I think it was done intentionally as an optimization, so that the master container's container_finished event would release the other containers' resources. Is this the reason?

        rohithsharma Rohith Sharma K S added a comment -

        Thanks Jian He for pointing me to the container recovery flow!! The issue priority can be decided later, not a problem.

        I had a deeper look at the NM registration flow. Two scenarios can occur:

        1. Node added event: again, two scenarios can occur here
          1. A new node is registering with a different ip:port – NOT A PROBLEM
          2. An old node is re-registering because of the RESYNC command from the RM after an RM restart – NOT A PROBLEM
        2. Node reconnected event:
          1. An existing node is re-registering, i.e. the RM treats it as a reconnecting node when the RM has not restarted
            1. NM RESTART NOT enabled – NOT A PROBLEM
            2. NM RESTART is enabled – Problem is here
              When the node is reconnected and applications are running on that node, the NMContainerStatuses are ignored. I think RMNodeReconnectEvent should carry the NMContainerStatuses and process them.
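
        For example, the reconnect event could carry the statuses along these lines (an illustrative sketch; the committed constructor may differ):

          // Illustrative shape: carry the registration-time container statuses
          // on the reconnect event so ReconnectNodeTransition can process them
          // instead of dropping them.
          public class RMNodeReconnectEvent extends RMNodeEvent {
            private final RMNode reconnectedNode;
            private final List<ApplicationId> runningApplications;
            private final List<NMContainerStatus> containerStatuses;

            public RMNodeReconnectEvent(NodeId nodeId, RMNode newNode,
                List<ApplicationId> runningApps,
                List<NMContainerStatus> containerStatuses) {
              super(nodeId, RMNodeEventType.RECONNECTED);
              this.reconnectedNode = newNode;
              this.runningApplications = runningApps;
              this.containerStatuses = containerStatuses;
            }

            public RMNode getReconnectedNode() {
              return reconnectedNode;
            }

            public List<ApplicationId> getRunningApplications() {
              return runningApplications;
            }

            public List<NMContainerStatus> getNMContainerStatuses() {
              return containerStatuses;
            }
          }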
        jianhe Jian He added a comment -

        I'm downgrading the priority for now. Please raise the priority if you still think this is a real issue.

        jianhe Jian He added a comment -

        RM(RMNodeImpl.AddNodeTransition#transition) is processing only RUNNING containers. COMPLETED containers are ignored.

        Completed containers are also processed. Please refer to RMContainerImpl#ContainerRecoveredTransition.
        Both running and completed containers sent by NM on re-registration will be processed by the new RM and routed back to the AM.

        rohithsharma Rohith Sharma K S added a comment -

        The current flow of notifying applications of completed containers is via the RM:

        1. The AM sends a stopContainerRequest to the NM
        2. The NM kills the container and sends its containerStatus to the RM in a heartbeat ONLY ONCE. The NM keeps this container until the RM piggybacks a response saying the container can be removed.
        3. The RM releases the resources that were allocated to the container, and sends a response back to the NM so the container is removed from the NM's tracking.
        4. In the AM's heartbeat to the RM, the RM sends the completed-container details back to the AM

        In case of NM restart, the NM sends the outstanding NMContainerStatus list to the RM during registration. This list contains both RUNNING and COMPLETED containers, and these container statuses are never sent again by the NM. But on NM registration, the RM (RMNodeImpl.AddNodeTransition#transition) processes only RUNNING containers; COMPLETED containers are ignored. The impact is that the resources allocated to these containers are never released by the RM, and at this point the RM state is completely inconsistent with the actual cluster state.

        The application hangs because, in the above inconsistent state, the RM never allocates the new containers asked for by the AM, so the AM keeps waiting for containers from the RM.

        Before the YARN-2997 fix, this worked fine because the completed containers sent in the NM registration were sent again in the first NM heartbeat, so RMNodeImpl handled the completed containers in the node-heartbeat event. The bug was hidden before YARN-2997 because the NM was sending duplicate container statuses, but really the RM should have handled it.
        Does it make sense?
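
        For reference, a rough sketch of the heartbeat-path handling described in steps 2-4 above (names are illustrative, not the exact Hadoop code); the registration-time NMContainerStatus list bypasses this path entirely:

          // RM side, simplified: pick the completions out of the heartbeat and
          // queue them for the scheduler so their resources get released.
          for (ContainerStatus status : remoteNodeStatus.getContainersStatuses()) {
            if (status.getState() == ContainerState.COMPLETE) {
              completedContainers.add(status); // step 2: reported by the NM ONLY ONCE
            }
          }
          rmNode.nodeUpdateQueue.add(new UpdatedContainerInfo(
              newlyLaunchedContainers, completedContainers)); // step 3: release
          // Step 4: the completed statuses flow back to the AM in its next
          // allocate() heartbeat response, and the NM is told it can drop them.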

        jianhe Jian He added a comment -

        Rohith Sharma K S, the completed containers are also notified to the applications. Mind clarifying a bit more how the application gets hung?


          People

          • Assignee: rohithsharma Rohith Sharma K S
          • Reporter: rohithsharma Rohith Sharma K S
          • Votes: 0
          • Watchers: 8
