Details
- Bug
- Status: Patch Available
- Major
- Resolution: Unresolved
- None
- None
Description
1. A network partition occurs: the RM cannot connect to the NMs, and start-AM requests are left pending.
2. The RM becomes standby. In ApplicationMasterLauncher#serviceStop, launcherPool is shut down and the launching thread is interrupted, but start-AM requests may still be left in the queue.
3. The network reconnects, and the standby RM sends the start-AM requests to the NMs.
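The three steps above can be reproduced in miniature with a plain java.util.concurrent pool. This is only a sketch, not the ResourceManager code: the class QueuedLaunchDemo, the method launchesAfterStop, and the latch networkBack are hypothetical names, and the counter stands in for "start AM request sent to an NM". It assumes the stale requests sit in the executor's own work queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the launcher pool; not the actual ResourceManager code. */
class QueuedLaunchDemo {

  /**
   * Returns how many queued "start AM" tasks still ran after the pool was stopped.
   * useShutdownNow=false stops the pool with shutdown(); true uses shutdownNow().
   */
  static int launchesAfterStop(boolean useShutdownNow) {
    ExecutorService launcherPool = Executors.newFixedThreadPool(1);
    CountDownLatch firstStarted = new CountDownLatch(1);
    CountDownLatch networkBack = new CountDownLatch(1);
    AtomicInteger launched = new AtomicInteger();

    // Step 1: the first request occupies the only worker, as if the NM were unreachable.
    launcherPool.execute(() -> {
      firstStarted.countDown();
      try {
        networkBack.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    // Three more requests pile up in the pool's queue.
    for (int i = 0; i < 3; i++) {
      launcherPool.execute(launched::incrementAndGet);
    }

    try {
      firstStarted.await();
      // Step 2: the service stops (RM transitions to standby).
      if (useShutdownNow) {
        launcherPool.shutdownNow();
      } else {
        launcherPool.shutdown();
      }
      // Step 3: the network comes back.
      networkBack.countDown();
      launcherPool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return launched.get();
  }
}
```

With shutdown(), the already-queued tasks still execute once the blocked worker is released, which mirrors the stale launches described in step 3. With shutdownNow(), the queue is drained and none of them run.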
Attachments
- YARN-5315.01.patch
- 1 kB
- sandflee
- YARN-5315.02.patch
- 1 kB
- sandflee
Activity
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 20s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
+1 | mvninstall | 7m 4s | trunk passed |
+1 | compile | 0m 33s | trunk passed |
+1 | checkstyle | 0m 21s | trunk passed |
+1 | mvnsite | 0m 38s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 0s | trunk passed |
+1 | javadoc | 0m 20s | trunk passed |
+1 | mvninstall | 0m 32s | the patch passed |
+1 | compile | 0m 30s | the patch passed |
+1 | javac | 0m 30s | the patch passed |
+1 | checkstyle | 0m 18s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7) |
+1 | mvnsite | 0m 34s | the patch passed |
+1 | mvneclipse | 0m 11s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 1m 7s | the patch passed |
+1 | javadoc | 0m 18s | the patch passed |
+1 | unit | 32m 31s | hadoop-yarn-server-resourcemanager in the patch passed. |
+1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
47m 27s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:9560f25 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12816436/YARN-5315.01.patch |
JIRA Issue | YARN-5315 |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux dea944e80d71 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / d792a90 |
Default Java | 1.8.0_91 |
findbugs | v3.0.0 |
Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12198/testReport/ |
modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12198/console |
Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated.
shutdownNow() still does not wait for actively executing tasks to terminate. You should make an explicit awaitTermination() call to do that.
We should also call "super.serviceStop()" to complete the life-cycle, can you make that change too?
shutdownNow() still does not wait for actively executing tasks to terminate. You should make an explicit awaitTermination() call to do that.
awaitTermination will block until all tasks terminate, which may delay the stop process of other services. Should we do that?
We should also call "super.serviceStop()" to complete the life-cycle, can you make that change too?
will do
awaitTermination will block until all tasks terminate, which may delay the stop process of other services. Should we do that?
I think a timed awaitTermination of 60 seconds can be used.
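A bounded wait of the kind suggested here could look roughly like the following. It is a minimal sketch using only java.util.concurrent: the class TimedStopSketch and the helper stopPool are hypothetical names, and the actual patch may differ. A real serviceStop() would pass a timeout equivalent to 60 seconds and, per the earlier review comment, finish by calling super.serviceStop().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper illustrating shutdownNow() followed by a timed awaitTermination(). */
class TimedStopSketch {

  /**
   * Stops the pool and waits at most timeoutMillis for running tasks to exit.
   * Returns true if the pool terminated within the timeout, false otherwise.
   */
  static boolean stopPool(ExecutorService pool, long timeoutMillis) {
    // Interrupt active workers and discard tasks that have not started yet.
    pool.shutdownNow();
    try {
      // Bounded wait, so other services are not held up indefinitely.
      return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Preserve the interrupt status for the caller and report "not terminated".
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

The timeout caps how long the stop phase can be delayed, at the cost of possibly returning while a task that ignores interruption is still running.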
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 32s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
+1 | mvninstall | 7m 31s | trunk passed |
+1 | compile | 0m 37s | trunk passed |
+1 | checkstyle | 0m 23s | trunk passed |
+1 | mvnsite | 0m 40s | trunk passed |
+1 | mvneclipse | 0m 16s | trunk passed |
+1 | findbugs | 1m 5s | trunk passed |
+1 | javadoc | 0m 26s | trunk passed |
+1 | mvninstall | 0m 36s | the patch passed |
+1 | compile | 0m 34s | the patch passed |
+1 | javac | 0m 34s | the patch passed |
+1 | checkstyle | 0m 19s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7) |
+1 | mvnsite | 0m 37s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 1m 11s | the patch passed |
+1 | javadoc | 0m 22s | the patch passed |
-1 | unit | 39m 37s | hadoop-yarn-server-resourcemanager in the patch failed. |
+1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
56m 3s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:9560f25 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12816980/YARN-5315.02.patch |
JIRA Issue | YARN-5315 |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux 55e20e72d697 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / da6f1b8 |
Default Java | 1.8.0_91 |
findbugs | v3.0.0 |
unit | https://builds.apache.org/job/PreCommit-YARN-Build/12253/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12253/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12253/testReport/ |
modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12253/console |
Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated.
sandflee, I think your first patch is fine. In the second patch, calling awaitTermination will not help in this situation; it only prolongs the stop phase if there are any pending tasks. Is my understanding right?
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 19s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
+1 | mvninstall | 6m 59s | trunk passed |
+1 | compile | 0m 33s | trunk passed |
+1 | checkstyle | 0m 21s | trunk passed |
+1 | mvnsite | 0m 38s | trunk passed |
+1 | mvneclipse | 0m 17s | trunk passed |
+1 | findbugs | 0m 58s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed |
+1 | mvninstall | 0m 31s | the patch passed |
+1 | compile | 0m 29s | the patch passed |
+1 | javac | 0m 29s | the patch passed |
+1 | checkstyle | 0m 18s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7) |
+1 | mvnsite | 0m 36s | the patch passed |
+1 | mvneclipse | 0m 14s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 1m 5s | the patch passed |
+1 | javadoc | 0m 18s | the patch passed |
+1 | unit | 33m 53s | hadoop-yarn-server-resourcemanager in the patch passed. |
+1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
48m 45s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:9560f25 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12816980/YARN-5315.02.patch |
JIRA Issue | YARN-5315 |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux f1f07c9c2a52 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 14a696f |
Default Java | 1.8.0_101 |
findbugs | v3.0.0 |
Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13209/testReport/ |
modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13209/console |
Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated.
shutdownNow:
Attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution. These tasks are drained (removed) from the task queue upon return from this method.
Thanks, jianhe. shutdownNow will interrupt active workers and drain pending tasks, so the difference is that patch 1 does not wait for active workers to terminate while patch 2 does. It seems we would not gain much from awaitTermination.
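The draining behaviour quoted from the shutdownNow documentation can be checked directly from the list it returns. The snippet below is a small illustration with hypothetical names (DrainDemo, drainedCount); it is not part of either patch.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical demo: shutdownNow() hands back the tasks that never started. */
class DrainDemo {

  /** Queues `pending` tasks behind one busy worker and returns how many shutdownNow() drains. */
  static int drainedCount(int pending) {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    AtomicBoolean started = new AtomicBoolean(false);

    // Keep the single worker busy until it is interrupted by shutdownNow().
    pool.execute(() -> {
      started.set(true);
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        // Expected: shutdownNow() interrupts the active worker.
      }
    });
    while (!started.get()) {
      Thread.yield();
    }

    // These tasks can only wait in the queue because the worker is occupied.
    for (int i = 0; i < pending; i++) {
      pool.execute(() -> { });
    }

    List<Runnable> drained = pool.shutdownNow();
    return drained.size();
  }
}
```

The returned list holds exactly the tasks that were still waiting, so nothing queued is left behind to run later, whether or not awaitTermination is called afterwards.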
Thank you, sandflee!
+1 (non-binding). It looks good to me. I verified and the change still applies to the current trunk.
Good catch!! +lgtm..