Details
Description
The testcase TestUberAM#testThreadDumpOnTaskTimeout was supposed to be fixed by MAPREDUCE-7020. However, it still fails, see: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7325/testReport/junit/org.apache.hadoop.mapreduce.v2/TestMRJobs/testThreadDumpOnTaskTimeout/ (note: other tests failed as well, but those look unrelated).
When I tried to reproduce it locally, it failed again, although with a slightly different error message (it was actually the same as before):
[INFO] ------------------------------------------------------- [INFO] T E S T S [INFO] ------------------------------------------------------- [INFO] Running org.apache.hadoop.mapreduce.v2.TestUberAM [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 128.192 s <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestUberAM [ERROR] testThreadDumpOnTaskTimeout(org.apache.hadoop.mapreduce.v2.TestUberAM) Time elapsed: 79.539 s <<< FAILURE! java.lang.AssertionError: No AppMaster log found! expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.hadoop.mapreduce.v2.TestMRJobs.testThreadDumpOnTaskTimeout(TestMRJobs.java:1228) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Root cause: System.exit() is still invoked at Task.statusUpdate()
public void statusUpdate(TaskUmbilicalProtocol umbilical) throws IOException { int retries = MAX_RETRIES; while (true) { try { if (!umbilical.statusUpdate(getTaskID(), taskStatus).getTaskFound()) { LOG.warn("Parent died. Exiting "+taskId); System.exit(66); } taskStatus.clearStatus(); return; ...
At this point, the task was not found and return value of umbilical.statusUpdate() is false. Checking whether we run in uber mode seems to solve the problem.