Two RMNodes for the same NodeId are used in RM sometimes after NM is reconnected.

    Details

      Description

      Two RMNodes for the same NodeId are sometimes used in the RM after an NM reconnects. After the reconnect, the scheduler and RMContext can hold different RMNode references for the same NodeId, which is incorrect. The scheduler and RMContext should always use the same RMNode reference for a given NodeId.
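The mismatch can be pictured with a small model. This is a hypothetical sketch, not ResourceManager code: `Node`, `contextNodes`, and `schedulerNodes` are stand-in names for RMNode, the node map held by RMContext, and the scheduler's view of nodes. It assumes the reconnect hands the scheduler a newly created object while the context keeps the old one.

```java
import java.util.HashMap;
import java.util.Map;

final class DuplicateNodeSketch {
    // Hypothetical stand-in for RMNode; object identity is what matters here.
    static final class Node {
        final String nodeId;
        int memoryMb;

        Node(String nodeId, int memoryMb) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
        }
    }

    // True only when both maps hold the very same object for nodeId.
    static boolean sameReference(Map<String, Node> contextNodes,
                                 Map<String, Node> schedulerNodes,
                                 String nodeId) {
        Node fromContext = contextNodes.get(nodeId);
        return fromContext != null && fromContext == schedulerNodes.get(nodeId);
    }

    // Models a reconnect in which the scheduler is given a new object
    // for an id that the context still maps to the old object.
    static boolean reconnectWithNewObject() {
        Map<String, Node> contextNodes = new HashMap<>();
        Map<String, Node> schedulerNodes = new HashMap<>();

        Node oldNode = new Node("host1:1234", 4096);
        contextNodes.put(oldNode.nodeId, oldNode);
        schedulerNodes.put(oldNode.nodeId, oldNode);

        Node reconnected = new Node("host1:1234", 8192);
        schedulerNodes.put(reconnected.nodeId, reconnected);

        return sameReference(contextNodes, schedulerNodes, "host1:1234");
    }

    public static void main(String[] args) {
        // prints "false": same id, two different objects
        System.out.println(reconnectWithNewObject());
    }
}
```

In this model, any later state change made through one reference is invisible through the other, which is why a single shared reference per NodeId is the invariant to preserve.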

      1. YARN-3802.001.patch
        8 kB
        zhihai xu
      2. YARN-3802.000.patch
        8 kB
        zhihai xu

          Activity

          zxu zhihai xu added a comment -

          I uploaded a patch, YARN-3802.000.patch, for review. The patch fixes the issue by using the old RMNode in NodeAddedSchedulerEvent and updating the old RMNode's capability from the new RMNode's capability.
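A minimal sketch of that idea, under stated assumptions: `Node` and `nodeForScheduler` are hypothetical names used only for illustration, and capability is reduced to two integers. It is not the code in the patch. It shows only the pattern of copying the new capability onto the old object and handing the old object onward.

```java
final class ReconnectFixSketch {
    // Hypothetical stand-in for RMNode with a simplified capability.
    static final class Node {
        final String nodeId;
        int memoryMb;
        int vcores;

        Node(String nodeId, int memoryMb, int vcores) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    // Copies the reconnected node's capability onto the existing object and
    // returns the existing object, so every holder keeps one shared reference.
    static Node nodeForScheduler(Node oldNode, Node newNode) {
        oldNode.memoryMb = newNode.memoryMb;
        oldNode.vcores = newNode.vcores;
        return oldNode;
    }

    public static void main(String[] args) {
        Node oldNode = new Node("host1:1234", 4096, 4);
        Node newNode = new Node("host1:1234", 8192, 8);

        Node added = nodeForScheduler(oldNode, newNode);

        System.out.println(added == oldNode); // prints "true"
        System.out.println(added.memoryMb);   // prints "8192"
    }
}
```

Reusing the old object means nothing that already holds a reference to it needs to be updated. Only the capability, which may have changed across the reconnect, is refreshed.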

          hadoopqa Hadoop QA added a comment -



          +1 overall



          || Vote || Subsystem || Runtime || Comment ||
          | 0 | pre-patch | 16m 2s | Pre-patch trunk compilation is healthy. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | +1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
          | +1 | javac | 7m 35s | There were no new javac warning messages. |
          | +1 | javadoc | 9m 39s | There were no new javadoc warning messages. |
          | +1 | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
          | +1 | checkstyle | 0m 48s | There were no new checkstyle issues. |
          | +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
          | +1 | install | 1m 35s | mvn install still works. |
          | +1 | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
          | +1 | findbugs | 1m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
          | +1 | yarn tests | 51m 11s | Tests passed in hadoop-yarn-server-resourcemanager. |
          | | | 89m 16s | |



          || Subsystem || Report/Notes ||
          | Patch URL | http://issues.apache.org/jira/secure/attachment/12739524/YARN-3802.000.patch |
          | Optional Tests | javadoc javac unit findbugs checkstyle |
          | git revision | trunk / b0dc291 |
          | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8249/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
          | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8249/testReport/ |
          | Java | 1.7.0_55 |
          | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8249/console |

          This message was automatically generated.

          xgong Xuan Gong added a comment -

          zhihai xu, the patch looks good overall. One nit:
          Could you fix this comment, too?

                       // Only add new node if old state is RUNNING
          
          zxu zhihai xu added a comment -

          Xuan Gong, thanks for the review. I uploaded a new patch YARN-3802.001.patch, which fixed the comment. Please review it.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          || Vote || Subsystem || Runtime || Comment ||
          | 0 | pre-patch | 15m 55s | Pre-patch trunk compilation is healthy. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | +1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
          | +1 | javac | 7m 33s | There were no new javac warning messages. |
          | +1 | javadoc | 9m 37s | There were no new javadoc warning messages. |
          | +1 | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
          | +1 | checkstyle | 0m 45s | There were no new checkstyle issues. |
          | +1 | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
          | +1 | install | 1m 33s | mvn install still works. |
          | +1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
          | +1 | findbugs | 1m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
          | -1 | yarn tests | 61m 4s | Tests failed in hadoop-yarn-server-resourcemanager. |
          | | | 98m 52s | |



          || Reason || Tests ||
          | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |



          || Subsystem || Report/Notes ||
          | Patch URL | http://issues.apache.org/jira/secure/attachment/12740317/YARN-3802.001.patch |
          | Optional Tests | javadoc javac unit findbugs checkstyle |
          | git revision | trunk / 6e0a9f9 |
          | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8286/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
          | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8286/testReport/ |
          | Java | 1.7.0_55 |
          | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8286/console |

          This message was automatically generated.

          zxu zhihai xu added a comment -

          The TestNodeLabelContainerAllocation test failure is not related to my change, because no test failure is shown in the test report: https://builds.apache.org/job/PreCommit-YARN-Build/8286/testReport/

          xgong Xuan Gong added a comment -

          +1 LGTM. Will commit

          xgong Xuan Gong added a comment -

          Committed into trunk/branch-2. Thanks, zhihai.

          zxu zhihai xu added a comment -

          Thanks, Xuan Gong, for reviewing and committing the patch.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #233 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/233/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #963 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/963/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2161 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2161/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #222 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/222/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #231 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/231/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2179 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2179/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8039 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8039/)
          YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes (xgong: rev 5b5bb8dcdc888ba1ebc7e4eba0fa0e7e79edda9a)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
          • hadoop-yarn-project/CHANGES.txt
          jlowe Jason Lowe added a comment -

          I committed this to branch-2.7 and branch-2.6 as well.


            People

            • Assignee:
              zxu zhihai xu
              Reporter:
              zxu zhihai xu
            • Votes:
              0
              Watchers:
              7
