Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
Reviewed
Description
Job history server will try to create a history directory in HDFS on startup. When NameNode is in safe mode, it will keep retrying for a configurable time period. However, it should also keeps retrying if the name node is in start state. Safe mode does not happen until the NN is out of the startup phase.
A RetriableException with the text "NameNode still not started" is thrown when the NN is in its internal service startup phase. We should add the check for this specific exception in isBecauseSafeMode() to account for that.
Attachments
Attachments
- mapreduce6657.007.patch
- 9 kB
- Haibo Chen
- mapreduce6657.006.patch
- 6 kB
- Haibo Chen
- mapreduce6657.005.patch
- 6 kB
- Haibo Chen
- mapreduce6657.004.patch
- 5 kB
- Haibo Chen
- mapreduce6657.003.patch
- 4 kB
- Haibo Chen
- mapreduce6657.002.patch
- 5 kB
- Haibo Chen
- mapreduce6657.001.patch
- 5 kB
- Haibo Chen
Activity
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 15s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 7m 36s | trunk passed |
+1 | compile | 0m 21s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 0m 19s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 13s | trunk passed |
+1 | mvnsite | 0m 25s | trunk passed |
+1 | mvneclipse | 0m 13s | trunk passed |
+1 | findbugs | 0m 35s | trunk passed |
+1 | javadoc | 0m 17s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 17s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 20s | the patch passed |
+1 | compile | 0m 18s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 0m 18s | the patch passed |
+1 | compile | 0m 17s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 17s | the patch passed |
-1 | checkstyle | 0m 11s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: patch generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) |
+1 | mvnsite | 0m 23s | the patch passed |
+1 | mvneclipse | 0m 12s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 47s | the patch passed |
+1 | javadoc | 0m 14s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 14s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 6m 28s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_74. |
+1 | unit | 6m 19s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
+1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
27m 32s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:fbe3e86 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795338/mapreduce6657.001.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux a0caa41ec4e4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 2e1d0ff |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6394/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6394/testReport/ |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6394/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
+1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 15s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 6m 41s | trunk passed |
+1 | compile | 0m 15s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 14s | trunk passed |
+1 | mvnsite | 0m 23s | trunk passed |
+1 | mvneclipse | 0m 13s | trunk passed |
+1 | findbugs | 0m 32s | trunk passed |
+1 | javadoc | 0m 13s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 15s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 20s | the patch passed |
+1 | compile | 0m 13s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 0m 13s | the patch passed |
+1 | compile | 0m 16s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 16s | the patch passed |
+1 | checkstyle | 0m 12s | the patch passed |
+1 | mvnsite | 0m 21s | the patch passed |
+1 | mvneclipse | 0m 11s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 42s | the patch passed |
+1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 5m 49s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_74. |
+1 | unit | 6m 7s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
+1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
25m 6s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:fbe3e86 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795430/mapreduce6657.002.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux 636e1ce9949f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 2c268cc |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6397/testReport/ |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6397/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
Thanks for the patch, haibochen.
I hate that HDFS expects you to parse the text of their exceptions to figure out what's going on. Wanna look into whether the API would allow you to throw a properly typed exception? Maybe just file a followup JIRA?
In your test code, it would be nice to add a javadoc header that explains what you're testing.
I don't love that you're running two mini-clusters and ignoring one of them. Is there any way to do the test with the existing mini-cluster without disrupting the other tests? If not, I'd consider creating a new test class so that you don't have two mini-clusters running.
Is 2000ms the shortest reasonable duration for the timeout? Seems long to me...
Assert.assertEquals("Job History Server is expected to time out.",
Your assert message is misleading. It should instead say that it didn't get the expected error message.
Thanks a lot for you comments, templedf I have added a brief javadoc and made the timeout to be 500. Let me know if 500 looks reasonable to you.
Also, the test method is now using the existing dfs cluster instead of a new local one. The only method in TestHistoryManager that is using is both dfs clusters is testCreateDirsWithAdditionalFileSystem(), so maybe it makes more sense to move that method out?
The behavior of JHS, when name node is in safe mode, is that it throws a YarnRuntimeException with a timeout message. I think the assert message is actually in line with the expected behavior.
The message says that the server should have timed out, but the assert is testing whether the exception message is correct when it does time out. If it doesn't time out, it looks to me like the test will pass. You should probably also have an Assert.fail() after the serviceInit() call.
updated the test method according to templedf's comments, and moved it to a new test class because it cannot share clusters with other test methods in TestHistoryFileManager.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 16s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
0 | mvndep | 0m 11s | Maven dependency ordering for branch |
+1 | mvninstall | 7m 31s | trunk passed |
+1 | compile | 2m 9s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 2m 14s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 29s | trunk passed |
+1 | mvnsite | 1m 6s | trunk passed |
+1 | mvneclipse | 0m 32s | trunk passed |
+1 | findbugs | 1m 26s | trunk passed |
+1 | javadoc | 0m 35s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 41s | trunk passed with JDK v1.7.0_95 |
0 | mvndep | 0m 13s | Maven dependency ordering for patch |
+1 | mvninstall | 0m 55s | the patch passed |
+1 | compile | 1m 40s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 2m 49s | hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.8.0_77 with JDK v1.8.0_77 generated 0 new + 356 unchanged - 6 fixed = 356 total (was 362) |
+1 | javac | 1m 40s | hadoop-mapreduce-client in the patch passed with JDK v1.8.0_77. |
+1 | compile | 1m 53s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 4m 42s | hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.7.0_95 with JDK v1.7.0_95 generated 0 new + 361 unchanged - 6 fixed = 361 total (was 367) |
+1 | javac | 1m 53s | hadoop-mapreduce-client in the patch passed with JDK v1.7.0_95. |
+1 | checkstyle | 0m 25s | the patch passed |
+1 | mvnsite | 0m 59s | the patch passed |
+1 | mvneclipse | 0m 25s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 42s | the patch passed |
+1 | javadoc | 0m 28s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 31s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 9m 42s | hadoop-mapreduce-client-app in the patch passed with JDK v1.8.0_77. |
-1 | unit | 108m 25s | hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_77. |
+1 | unit | 10m 19s | hadoop-mapreduce-client-app in the patch passed with JDK v1.7.0_95. |
-1 | unit | 109m 5s | hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 32s | Patch does not generate ASF License warnings. |
265m 49s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.mapred.TestMRCJCFileOutputCommitter |
JDK v1.7.0_95 Failed junit tests | hadoop.mapred.TestMRCJCFileOutputCommitter |
This message was automatically generated.
Thanks, haibochen. Some comments:
* HDFS is not running normally (either in start phrase or
should be
* HDFS is not running normally (either in start phase or
private static final String CLUSTER_BASE_DIR = MiniDFSCluster.getBaseDirectory(); ... conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, CLUSTER_BASE_DIR.substring(0, CLUSTER_BASE_DIR.length() - 1) + "_safemode");
Why not just set the base dir to what you want initially?
final long maxJHSWaitTime = 500;
Tiny quibble: the name should probably be maxJhsWaitTime. We have conflicting styles in the code, but IIRC the style guide says to only capitalize the first letter of acronyms in names. (I could be wrong, so feel free to call my bluff.)
dfsCluster.getFileSystem().setSafeMode( HdfsConstants.SafeModeAction.SAFEMODE_ENTER); Assert.assertTrue(dfsCluster.getFileSystem().isInSafeMode());
To be completely safe these lines should be inside the try.
Assert.assertEquals("Job History Server is expected to be " +
expectedExceptionMsg, expectedExceptionMsg, yex.getMessage());
should probably be more like
Assert.assertEquals("Unexpected reconnect timeout exception message",
expectedExceptionMsg, yex.getMessage());
The assert will include the expected value in the output.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 23s | Docker mode activated. |
+1 | @author | 0m 1s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 8m 37s | trunk passed |
+1 | compile | 0m 23s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 21s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 16s | trunk passed |
+1 | mvnsite | 0m 28s | trunk passed |
+1 | mvneclipse | 0m 15s | trunk passed |
+1 | findbugs | 0m 42s | trunk passed |
+1 | javadoc | 0m 18s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 18s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 22s | the patch passed |
+1 | compile | 0m 19s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 19s | the patch passed |
+1 | compile | 0m 19s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 19s | the patch passed |
+1 | checkstyle | 0m 13s | the patch passed |
+1 | mvnsite | 0m 26s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 52s | the patch passed |
+1 | javadoc | 0m 15s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 17s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 6m 57s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_77. |
+1 | unit | 6m 56s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
-1 | asflicense | 0m 21s | Patch generated 1 ASF License warnings. |
30m 43s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:fbe3e86 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799789/mapreduce6657.003.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux ce566e720948 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / ad36fa6 |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6450/testReport/ |
asflicense | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6450/artifact/patchprocess/patch-asflicense-problems.txt |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6450/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
Thanks templedf and @Ray Chiang for your reviews. I have updated my patch accordingly.
+1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 17s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 6m 37s | trunk passed |
+1 | compile | 0m 15s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 14s | trunk passed |
+1 | mvnsite | 0m 22s | trunk passed |
+1 | mvneclipse | 0m 13s | trunk passed |
+1 | findbugs | 0m 32s | trunk passed |
+1 | javadoc | 0m 13s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 17s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 18s | the patch passed |
+1 | compile | 0m 13s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 13s | the patch passed |
+1 | compile | 0m 15s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 15s | the patch passed |
+1 | checkstyle | 0m 11s | the patch passed |
+1 | mvnsite | 0m 20s | the patch passed |
+1 | mvneclipse | 0m 10s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 40s | the patch passed |
+1 | javadoc | 0m 10s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 5m 49s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_77. |
+1 | unit | 6m 9s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
+1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
24m 59s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:fbe3e86 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799883/mapreduce6657.004.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux e11038ba9f14 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 1e48eef |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6452/testReport/ |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6452/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
Looks good, haibochen. One thing I just noticed, though, is that you need to update the comment on HistoryFileManager.isBecauseSafeMode() to reflect the new behavior. You might consider renaming the method as well, since the name is also no longer exactly accurate.
Maybe rename it to "checkNameNodeNotStartedYet" or just "checkNameNodeNotStarted" ?
Can we make the method name should be something more like isNameNodeUnavailable() or isNameNodeNotReady() so that it's more inclusive of both safe mode and slow starts? Not a deal breaker...
isNameNodeUnavailable() might be a little too broad as it covers cases where name node can start up and then becomes unavailable, in which JobHistoryServer just throws an Error during initialization as indicated by the current code. I guess the question then becomes do we want JHS to keep retrying even when cases where name node is already started but unavailable.
+1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 10s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 9m 14s | trunk passed |
+1 | compile | 0m 23s | trunk passed with JDK v1.8.0_92 |
+1 | compile | 0m 22s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 17s | trunk passed |
+1 | mvnsite | 0m 28s | trunk passed |
+1 | mvneclipse | 0m 16s | trunk passed |
+1 | findbugs | 0m 44s | trunk passed |
+1 | javadoc | 0m 19s | trunk passed with JDK v1.8.0_92 |
+1 | javadoc | 0m 20s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 23s | the patch passed |
+1 | compile | 0m 21s | the patch passed with JDK v1.8.0_92 |
+1 | javac | 0m 21s | the patch passed |
+1 | compile | 0m 20s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 20s | the patch passed |
+1 | checkstyle | 0m 15s | the patch passed |
+1 | mvnsite | 0m 28s | the patch passed |
+1 | mvneclipse | 0m 15s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 52s | the patch passed |
+1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0_92 |
+1 | javadoc | 0m 17s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 7m 14s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_92. |
+1 | unit | 7m 8s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
+1 | asflicense | 0m 23s | Patch does not generate ASF License warnings. |
32m 0s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:fbe3e86 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800866/mapreduce6657.005.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux fb4684770a0c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 6be22dd |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_92 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6462/testReport/ |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6462/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
Thanks haibochen for the patch.
The hard code of checking message string is very flaky:
+ return ex.toString().contains("SafeModeException") || + (ex instanceof RetriableException && ex.getMessage().contains( + "NameNode still not started"));
If HDFS in future change exception message to something else. i.e. "Namenode not start yet.", then the issue will come up again. Instead, we should define the message as a static string.
Other looks fine.
Thanks for updating the patch, haibochen.
My above comments is actually trying to say we should define static string in where exception get throw.
In this case, we should also change NameNodeRpcServer.java:
private void checkNNStartup() throws IOException { if (!this.nn.isStarted()) { throw new RetriableException(this.nn.getRole() + " still not started"); } }
If we define some static string in HDFS and use in both side (NameNodeRpcServer and HistoryFileManager), that can make sure we won't hit this issue again in future if we update exception string.
Sorry for misunderstanding your previous comments. Do you think we should create a subclass of RetriableException for this instead? junping_du The message is derived from a instance method this.nn.getRole(), and doing string matching is probably not the cleanest way. If so, I can create file a follow-up jira in HDFS and update isNameNodeStillNotStarted() when we have the new 'NameNodeNotStartedException' that extends RetriableException.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 12s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 6m 50s | trunk passed |
+1 | compile | 0m 15s | trunk passed with JDK v1.8.0_91 |
+1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 14s | trunk passed |
+1 | mvnsite | 0m 22s | trunk passed |
+1 | mvneclipse | 0m 13s | trunk passed |
+1 | findbugs | 0m 33s | trunk passed |
+1 | javadoc | 0m 12s | trunk passed with JDK v1.8.0_91 |
+1 | javadoc | 0m 16s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 18s | the patch passed |
+1 | compile | 0m 13s | the patch passed with JDK v1.8.0_91 |
+1 | javac | 0m 13s | the patch passed |
+1 | compile | 0m 15s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 15s | the patch passed |
-1 | checkstyle | 0m 13s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: patch generated 2 new + 16 unchanged - 0 fixed = 18 total (was 16) |
+1 | mvnsite | 0m 20s | the patch passed |
+1 | mvneclipse | 0m 10s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 41s | the patch passed |
+1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_91 |
+1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 6m 3s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. |
+1 | unit | 6m 27s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
+1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
25m 46s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:cf2ee45 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800866/mapreduce6657.005.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux 95c33ef8963a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 687233f |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6493/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6493/testReport/ |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6493/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
Do you think we should create a subclass of RetriableException for this instead?
It is up to you. IMO, it is not necessary to do so just for a special case or it could be too many sub-exceptions.
The message is derived from a instance method this.nn.getRole(), and doing string matching is probably not the cleanest way.
You can make a static method for this.nn.getRole() + " still not started" with input of daemon's name ("NameNode" here) which is accessible from both HDFS and MAPREDUCE (JHS). In JHS, just put "NameNode" (or move NamenodeRole from HdfsServerConstants to HdfsConstants and share to JHS) and get the same string with HDFS. That could be much cleaner.
Thanks a lot for your remarks, junping_du. For this jira, I will keep using string comparison given that the issue you pointed out is more of a HDFS issue. Will file a follow up HDFS jira to fix it. Uploaded a patch that is the same as mapreduce.005.patch but with the checkstyle fix.
haibochen, thanks. As long as you file the follow-up JIRA, I'm fine with that. I've encountered the same issue with HDFS exception handling before and dealt with it the same way. Fixing HDFS is out of scope for this change.
I chatted with haibochen. I think the HDFS change is trivial and it would be silly for us to fix this with the 006 patch, but then have to do it again when the HDFS JIRA is done. May as well do it now.
+1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 23s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | mvninstall | 8m 32s | trunk passed |
+1 | compile | 0m 25s | trunk passed with JDK v1.8.0_91 |
+1 | compile | 0m 21s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 27s | trunk passed |
+1 | mvneclipse | 0m 16s | trunk passed |
+1 | findbugs | 0m 44s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_91 |
+1 | javadoc | 0m 17s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 23s | the patch passed |
+1 | compile | 0m 23s | the patch passed with JDK v1.8.0_91 |
+1 | javac | 0m 23s | the patch passed |
+1 | compile | 0m 20s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 20s | the patch passed |
+1 | checkstyle | 0m 16s | the patch passed |
+1 | mvnsite | 0m 25s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 0m 56s | the patch passed |
+1 | javadoc | 0m 10s | the patch passed with JDK v1.8.0_91 |
+1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
+1 | unit | 6m 5s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. |
+1 | unit | 6m 19s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
+1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
29m 9s |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:cf2ee45 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803913/mapreduce6657.006.patch |
JIRA Issue | |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux 29293b042b44 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / 1f2794b |
Default Java | 1.7.0_95 |
Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
findbugs | v3.0.0 |
JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6497/testReport/ |
modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6497/console |
Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
I think the HDFS change is trivial and it would be silly for us to fix this with the 006 patch, but then have to do it again when the HDFS JIRA is done. May as well do it now.
Agree. The change on HDFS is relatively trivial and lower risky. No necessary to have a separated JIRA.
Updated the patch with Junping's comments on adding a static method for this.nn.getRole() + " still not started".
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 14s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
0 | mvndep | 0m 16s | Maven dependency ordering for branch |
+1 | mvninstall | 7m 56s | trunk passed |
+1 | compile | 7m 3s | trunk passed with JDK v1.8.0_91 |
+1 | compile | 7m 38s | trunk passed with JDK v1.7.0_101 |
+1 | checkstyle | 1m 35s | trunk passed |
+1 | mvnsite | 1m 21s | trunk passed |
+1 | mvneclipse | 0m 29s | trunk passed |
+1 | findbugs | 2m 42s | trunk passed |
+1 | javadoc | 1m 23s | trunk passed with JDK v1.8.0_91 |
+1 | javadoc | 2m 21s | trunk passed with JDK v1.7.0_101 |
0 | mvndep | 0m 17s | Maven dependency ordering for patch |
+1 | mvninstall | 1m 19s | the patch passed |
+1 | compile | 6m 55s | the patch passed with JDK v1.8.0_91 |
+1 | javac | 6m 55s | the patch passed |
+1 | compile | 6m 40s | the patch passed with JDK v1.7.0_101 |
+1 | javac | 6m 40s | the patch passed |
+1 | checkstyle | 1m 25s | the patch passed |
+1 | mvnsite | 1m 13s | the patch passed |
+1 | mvneclipse | 0m 27s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | xml | 0m 1s | The patch has no ill-formed XML file. |
+1 | findbugs | 2m 52s | the patch passed |
+1 | javadoc | 1m 17s | the patch passed with JDK v1.8.0_91 |
+1 | javadoc | 2m 2s | the patch passed with JDK v1.7.0_101 |
-1 | unit | 62m 23s | hadoop-hdfs in the patch failed with JDK v1.8.0_91. |
+1 | unit | 6m 12s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. |
-1 | unit | 57m 22s | hadoop-hdfs in the patch failed with JDK v1.7.0_101. |
+1 | unit | 6m 24s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_101. |
+1 | asflicense | 0m 28s | Patch does not generate ASF License warnings. |
192m 34s |
Reason | Tests |
---|---|
JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
JDK v1.7.0_101 Failed junit tests | hadoop.hdfs.TestCrcCorruption |
hadoop.hdfs.TestAsyncDFSRename |
This message was automatically generated.
The test failure is not related.
007 patch LGTM. +1. Will commit it shortly if no further comments from others.
I have commit the patch to trunk and branch-2. Thanks haibochen for contributing the patch and templedf and rkanter for review and comments!
SUCCESS: Integrated in Hadoop-trunk-Commit #9807 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9807/)
MAPREDUCE-6657. Job history server can fail on startup when NameNode is (junping_du: rev f6ef876fe158a5334cad7075f1966573a1c4dec9)
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/pom.xml
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestHistoryFileManagerInitWithNonRunningDFS.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
Added a unit test for SafeModeException in Job History File Manager, and manually test the case when Name Node is in start phase because it happens when multiple threads run in certain order in NameNode (namely, name node initialization thread, and RPC server thread)