  1. Hadoop YARN
  2. YARN-5315

Standby RM keeps sending start AM container requests to NM

Details

    • Bug
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • None
    • None
    • resourcemanager

    Description

      1. A network partition occurs: the RM cannot connect to the NMs, and start AM requests are left pending.
      2. The RM becomes standby. In ApplicationMasterLauncher#serviceStop, launcherPool is shut down and the launching threads are interrupted, but start AM requests may still be left in the queue.
      3. The network reconnects, and the standby RM sends the start AM requests to the NM.
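
The steps above can be reproduced outside Hadoop with a plain java.util.concurrent pool. This is only a minimal sketch, not the RM code: the class and method names are hypothetical, the single blocked task stands in for a launch stuck behind the partition, and it assumes the pool was being stopped with shutdown(), as the description implies. It counts how many queued tasks still run once the "network" comes back, for shutdown() versus shutdownNow().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the launcher pool; not Hadoop code.
final class LauncherPoolShutdownDemo {

    /** Returns how many queued tasks still ran after the pool was stopped. */
    static int queuedTasksRunAfterStop(boolean useShutdownNow) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch blockerStarted = new CountDownLatch(1);
        CountDownLatch networkBack = new CountDownLatch(1);
        AtomicInteger ranAfterStop = new AtomicInteger();

        // Stands in for a start AM request stuck on an unreachable NM.
        pool.execute(() -> {
            blockerStarted.countDown();
            try {
                networkBack.await();
            } catch (InterruptedException e) {
                // Interrupted by shutdownNow(): give up on this launch.
            }
        });

        try {
            blockerStarted.await();
            // Two more requests pile up in the queue behind the stuck one.
            for (int i = 0; i < 2; i++) {
                pool.execute(ranAfterStop::incrementAndGet);
            }

            // The "RM becomes standby" step.
            if (useShutdownNow) {
                pool.shutdownNow(); // interrupts the worker and drains the queue
            } else {
                pool.shutdown();    // rejects new tasks but keeps the queued ones
            }

            networkBack.countDown(); // the "network reconnects" step
            // Waiting here only makes the count stable for this demo.
            pool.awaitTermination(10, TimeUnit.SECONDS);
            return ranAfterStop.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown():    " + queuedTasksRunAfterStop(false));
        System.out.println("shutdownNow(): " + queuedTasksRunAfterStop(true));
    }
}
```

With shutdown() the two queued tasks still execute after the stop, which mirrors step 3; with shutdownNow() they are removed from the queue and never run.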

      Attachments

        1. YARN-5315.01.patch
          1 kB
          sandflee
        2. YARN-5315.02.patch
          1 kB
          sandflee

        Activity

          rohithsharma Rohith Sharma K S added a comment -

          Good catch!! +lgtm..
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 7m 4s trunk passed
          +1 compile 0m 33s trunk passed
          +1 checkstyle 0m 21s trunk passed
          +1 mvnsite 0m 38s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 0s trunk passed
          +1 javadoc 0m 20s trunk passed
          +1 mvninstall 0m 32s the patch passed
          +1 compile 0m 30s the patch passed
          +1 javac 0m 30s the patch passed
          +1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7)
          +1 mvnsite 0m 34s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 7s the patch passed
          +1 javadoc 0m 18s the patch passed
          +1 unit 32m 31s hadoop-yarn-server-resourcemanager in the patch passed.
          +1 asflicense 0m 17s The patch does not generate ASF License warnings.
          47m 27s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12816436/YARN-5315.01.patch
          JIRA Issue YARN-5315
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux dea944e80d71 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / d792a90
          Default Java 1.8.0_91
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/12198/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/12198/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.


          vinodkv Vinod Kumar Vavilapalli added a comment -

          shutdownNow() still does not wait for actively executing tasks to terminate. You should make an explicit awaitTermination() call to do that.

          We should also call "super.serviceStop()" to complete the life-cycle, can you make that change too?
          sandflee sandflee added a comment -

          > shutdownNow() still does not wait for actively executing tasks to terminate. You should make an explicit awaitTermination() call to do that.

          awaitTermination will block until all tasks terminate, which may delay the stop process of other services. Should we do that?

          > We should also call "super.serviceStop()" to complete the life-cycle, can you make that change too?

          Will do.

          rohithsharma Rohith Sharma K S added a comment -

          > awaitTermination will block until all tasks terminate, which may delay the stop process of other services. Should we do that?

          I think a timed awaitTermination of 60 seconds can be done.
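
A rough sketch of what the suggested timed wait could look like, using only java.util.concurrent. The helper name stopPool and the class name are hypothetical, and this is not the content of either attached patch. The real serviceStop would also need the super.serviceStop() call requested above, which is left out here because it depends on Hadoop classes.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; illustrates shutdownNow() followed by a bounded wait.
final class TimedStopSketch {

    /**
     * Interrupts active workers, drops queued tasks, then waits at most
     * the given timeout. Returns true if the pool terminated in time.
     */
    static boolean stopPool(ExecutorService pool, long timeout, TimeUnit unit) {
        List<Runnable> dropped = pool.shutdownNow();
        System.out.println("Dropped " + dropped.size() + " pending task(s)");
        try {
            // Bounded wait, so a stuck worker cannot hold up the stop forever.
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report "not terminated".
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.execute(() -> {
            try {
                Thread.sleep(60_000); // simulated long-running launch
            } catch (InterruptedException e) {
                // Exit promptly when interrupted.
            }
        });
        // 60 seconds is the value proposed in the comment above.
        System.out.println("terminated: " + stopPool(pool, 60, TimeUnit.SECONDS));
    }
}
```

The timeout caps how long the stop phase can be delayed, which is the concern raised about an unbounded awaitTermination.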
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 32s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 7m 31s trunk passed
          +1 compile 0m 37s trunk passed
          +1 checkstyle 0m 23s trunk passed
          +1 mvnsite 0m 40s trunk passed
          +1 mvneclipse 0m 16s trunk passed
          +1 findbugs 1m 5s trunk passed
          +1 javadoc 0m 26s trunk passed
          +1 mvninstall 0m 36s the patch passed
          +1 compile 0m 34s the patch passed
          +1 javac 0m 34s the patch passed
          +1 checkstyle 0m 19s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7)
          +1 mvnsite 0m 37s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 11s the patch passed
          +1 javadoc 0m 22s the patch passed
          -1 unit 39m 37s hadoop-yarn-server-resourcemanager in the patch failed.
          +1 asflicense 0m 18s The patch does not generate ASF License warnings.
          56m 3s



          Reason Tests
          Failed junit tests hadoop.yarn.server.resourcemanager.TestRMRestart



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12816980/YARN-5315.02.patch
          JIRA Issue YARN-5315
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 55e20e72d697 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / da6f1b8
          Default Java 1.8.0_91
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-YARN-Build/12253/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/12253/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/12253/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/12253/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          sandflee sandflee added a comment -

          The test failure is unrelated and is tracked by YARN-5037.
          jianhe Jian He added a comment -

          sandflee, I think your first patch is fine. In the 2nd patch, calling awaitTermination will not help this situation; it only prolongs the stop phase if there are any pending tasks. Is my understanding right?
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 6m 59s trunk passed
          +1 compile 0m 33s trunk passed
          +1 checkstyle 0m 21s trunk passed
          +1 mvnsite 0m 38s trunk passed
          +1 mvneclipse 0m 17s trunk passed
          +1 findbugs 0m 58s trunk passed
          +1 javadoc 0m 21s trunk passed
          +1 mvninstall 0m 31s the patch passed
          +1 compile 0m 29s the patch passed
          +1 javac 0m 29s the patch passed
          +1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7)
          +1 mvnsite 0m 36s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 5s the patch passed
          +1 javadoc 0m 18s the patch passed
          +1 unit 33m 53s hadoop-yarn-server-resourcemanager in the patch passed.
          +1 asflicense 0m 18s The patch does not generate ASF License warnings.
          48m 45s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12816980/YARN-5315.02.patch
          JIRA Issue YARN-5315
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux f1f07c9c2a52 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 14a696f
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13209/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13209/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          sandflee sandflee added a comment -

          shutdownNow:
          Attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution. These tasks are drained (removed) from the task queue upon return from this method.

          Thanks jianhe. shutdownNow will interrupt the active workers and drain the pending tasks, so the difference is that patch 1 will not wait for the active workers to terminate, but patch 2 will. It seems we wouldn't get much benefit from awaitTermination.
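
To make the difference between the two patches concrete, here is a small sketch under stated assumptions: the class, the Result holder, and the simulated "cleanup" latch are all hypothetical and only illustrate the java.util.concurrent behaviour quoted above. It records what shutdownNow() returns, whether the pool is already terminated right after that call, and whether it is terminated after a later awaitTermination().

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo; not taken from either patch.
final class ShutdownNowObservation {

    static final class Result {
        final int drainedCount;
        final boolean terminatedRightAfterShutdownNow;
        final boolean terminatedAfterAwait;

        Result(int drained, boolean rightAfter, boolean afterAwait) {
            this.drainedCount = drained;
            this.terminatedRightAfterShutdownNow = rightAfter;
            this.terminatedAfterAwait = afterAwait;
        }
    }

    static Result observe() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finishCleanup = new CountDownLatch(1);

        // Active worker: once interrupted, it still needs time to unwind.
        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                awaitQuietly(finishCleanup); // simulated slow cleanup
            }
        });
        pool.execute(() -> { }); // one pending task behind the active one

        try {
            started.await();
            List<Runnable> drained = pool.shutdownNow();  // returns immediately
            boolean rightAfter = pool.isTerminated();     // worker still unwinding
            finishCleanup.countDown();
            boolean afterAwait = pool.awaitTermination(10, TimeUnit.SECONDS);
            return new Result(drained.size(), rightAfter, afterAwait);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The pending task is already gone from the queue when shutdownNow() returns, so the extra awaitTermination() only affects how long the stop phase waits for the interrupted worker.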
          miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

          Thank you, sandflee!
          +1 (non-binding). It looks good to me. I verified and the change still applies to the current trunk.


          People

            sandflee sandflee
            sandflee sandflee
            Votes:
            0
            Watchers:
            8