Hadoop YARN / YARN-4431

Not necessary to do unRegisterNM() if NM get stop due to failed to connect to RM

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nodemanager
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      2015-12-07 12:16:57,873 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
      2015-12-07 12:16:58,874 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
      2015-12-07 12:16:58,876 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unregistration of the Node 10.200.10.53:25454 failed.
      java.net.ConnectException: Call From jduMBP.local/10.200.10.53 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
              at sun.reflect.GeneratedConstructorAccessor30.newInstance(Unknown Source)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
              at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
              at org.apache.hadoop.ipc.Client.call(Client.java:1452)
              at org.apache.hadoop.ipc.Client.call(Client.java:1385)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
              at com.sun.proxy.$Proxy74.unRegisterNodeManager(Unknown Source)
              at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.unRegisterNodeManager(ResourceTrackerPBClientImpl.java:98)
              at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:483)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:255)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
              at com.sun.proxy.$Proxy75.unRegisterNodeManager(Unknown Source)
              at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.unRegisterNM(NodeStatusUpdaterImpl.java:267)
              at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStop(NodeStatusUpdaterImpl.java:245)
              at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
              at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
              at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
              at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
              at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
              at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:377)
      

      If the RM is down for some reason, the NM's NodeStatusUpdaterImpl retries the connection according to its retry policy. After exhausting the retries (15 minutes by default), it sends NodeManagerEventType.SHUTDOWN to shut down the NM. But NM shutdown calls NodeStatusUpdaterImpl.serviceStop(), which calls unRegisterNM() to unregister the NM from the RM, and that call goes through the same retries again (another 15 minutes). This is completely unnecessary: we should skip unRegisterNM() when the NM is shutting down because of connection issues.
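
The intended behaviour can be illustrated with a small, self-contained sketch. This is not the attached patch or the real NodeStatusUpdaterImpl code: `RmClient`, `StatusUpdaterSketch`, `UnregisterSkipDemo`, and the `failedToConnect` flag are hypothetical names used only to show the idea of remembering a connection failure and skipping the unregister call on stop.

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical stand-in for the RM-facing client; not the real ResourceTracker API. */
interface RmClient {
    void heartbeat() throws IOException;
    void unregister() throws IOException;
}

/** Simplified model of a node status updater; all names are illustrative. */
class StatusUpdaterSketch {
    private final RmClient rm;
    private boolean registered = true;
    // Assumed flag: set once the updater gives up on reaching the RM.
    private boolean failedToConnect = false;

    StatusUpdaterSketch(RmClient rm) {
        this.rm = rm;
    }

    /** Tries one heartbeat; returns true if the node should shut down. */
    boolean heartbeatOnce() {
        try {
            rm.heartbeat();
            return false;
        } catch (IOException e) {
            // In the scenario described above, the retries are already exhausted here.
            failedToConnect = true;
            return true;
        }
    }

    /** Unregisters only if the RM was reachable; otherwise skips the futile call. */
    void stop() {
        if (registered && !failedToConnect) {
            try {
                rm.unregister();
            } catch (IOException e) {
                // A real service would log this and continue stopping.
            }
        }
        registered = false;
    }
}

final class UnregisterSkipDemo {
    /** Runs one heartbeat followed by stop() and counts unregister attempts. */
    static int countUnregisterCalls(boolean rmReachable) {
        int[] calls = {0};
        RmClient rm = new RmClient() {
            @Override
            public void heartbeat() throws IOException {
                if (!rmReachable) {
                    throw new ConnectException("Connection refused");
                }
            }

            @Override
            public void unregister() {
                calls[0]++;
            }
        };
        StatusUpdaterSketch updater = new StatusUpdaterSketch(rm);
        updater.heartbeatOnce();
        updater.stop();
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("RM reachable:   " + countUnregisterCalls(true));  // prints 1
        System.out.println("RM unreachable: " + countUnregisterCalls(false)); // prints 0
    }
}
```

With the flag in place, a shutdown triggered by a connection failure never enters the unregister path, so the second round of retries described above does not happen, while a normal stop still unregisters the node.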

        Activity

        djp Junping Du added a comment -

        Uploaded a patch. It is quite straightforward, so no unit test is needed.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 0s    Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        -1    test4tests  0m 0s    The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1    mvninstall  7m 34s   trunk passed
        +1    compile     0m 25s   trunk passed with JDK v1.8.0_66
        +1    compile     0m 27s   trunk passed with JDK v1.7.0_91
        +1    checkstyle  0m 10s   trunk passed
        +1    mvnsite     0m 28s   trunk passed
        +1    mvneclipse  0m 13s   trunk passed
        +1    findbugs    0m 56s   trunk passed
        +1    javadoc     0m 18s   trunk passed with JDK v1.8.0_66
        +1    javadoc     0m 21s   trunk passed with JDK v1.7.0_91
        +1    mvninstall  0m 28s   the patch passed
        +1    compile     0m 24s   the patch passed with JDK v1.8.0_66
        +1    javac       0m 24s   the patch passed
        +1    compile     0m 28s   the patch passed with JDK v1.7.0_91
        +1    javac       0m 28s   the patch passed
        -1    checkstyle  0m 10s   Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager (total was 30, now 30).
        +1    mvnsite     0m 29s   the patch passed
        +1    mvneclipse  0m 13s   the patch passed
        +1    whitespace  0m 0s    Patch has no whitespace issues.
        +1    findbugs    1m 3s    the patch passed
        +1    javadoc     0m 17s   the patch passed with JDK v1.8.0_66
        +1    javadoc     0m 22s   the patch passed with JDK v1.7.0_91
        +1    unit        8m 39s   hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66.
        +1    unit        9m 5s    hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91.
        +1    asflicense  0m 23s   Patch does not generate ASF License warnings.
                          33m 57s



        Subsystem                   Report/Notes
        Docker Image                yetus/hadoop:0ca8df7
        JIRA Patch URL              https://issues.apache.org/jira/secure/attachment/12776149/YARN-4431.patch
        JIRA Issue                  YARN-4431
        Optional Tests              asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname                       Linux fd457c42c4ab 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool                  maven
        Personality                 /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision                trunk / 01a641b
        findbugs                    v3.0.0
        checkstyle                  https://builds.apache.org/job/PreCommit-YARN-Build/9890/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
        JDK v1.7.0_91 Test Results  https://builds.apache.org/job/PreCommit-YARN-Build/9890/testReport/
        modules                     C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
        Max memory used             76MB
        Powered by                  Apache Yetus http://yetus.apache.org
        Console output              https://builds.apache.org/job/PreCommit-YARN-Build/9890/console

        This message was automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        +1 lgtm

        rohithsharma Rohith Sharma K S added a comment -

        committing shortly

        rohithsharma Rohith Sharma K S added a comment -

        Committed to trunk/branch-2/branch-2.8. Thanks Junping Du for the patch!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #8945 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8945/)
        YARN-4431. Not necessary to do unRegisterNM() if NM get stop due to (rohithsharmaks: rev 15c3e7ffe3d1c57ad36afd993f09fc47889c93bd)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #678 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/678/)
        YARN-4431. Not necessary to do unRegisterNM() if NM get stop due to (rohithsharmaks: rev 15c3e7ffe3d1c57ad36afd993f09fc47889c93bd)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
        • hadoop-yarn-project/CHANGES.txt
        djp Junping Du added a comment -

        Thanks Rohith Sharma K S for review and commit!


          People

          • Assignee: Junping Du (djp)
          • Reporter: Junping Du (djp)
          • Votes: 0
          • Watchers: 6
