TestMRJobs#testThreadDumpOnTaskTimeout has been failing sporadically recently. When the AM times out a task, it immediately removes the task from the list of known tasks and then connects to the NM to request a thread dump followed by a kill. If the task heartbeats in after it has been removed from that list but before the thread dump signal arrives, the task can exit with an "org.apache.hadoop.mapred.Task: Parent died." message and no thread dump.
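The ordering problem can be illustrated with a small single-threaded model. This is only a sketch of the sequence described above, not Hadoop code: the class `TaskTimeoutRace`, its methods, and the event strings are all hypothetical names made up for illustration, and the real AM, NM, and task run concurrently in separate processes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the race; none of these names are Hadoop APIs. */
class TaskTimeoutRace {
    /** Tasks the simulated AM still recognizes. */
    private final Set<String> knownTasks = new HashSet<>();
    /** Ordered record of what happened. */
    private final List<String> events = new ArrayList<>();
    private boolean taskAlive = true;

    TaskTimeoutRace(String taskId) {
        knownTasks.add(taskId);
    }

    /** The AM times out the task and forgets it immediately. */
    void amTimesOutTask(String taskId) {
        knownTasks.remove(taskId);
        events.add("AM removed " + taskId);
    }

    /** The task heartbeats in; a task the AM no longer knows exits. */
    void taskHeartbeats(String taskId) {
        if (!taskAlive) {
            return;
        }
        if (knownTasks.contains(taskId)) {
            events.add("heartbeat ok");
        } else {
            taskAlive = false;
            events.add("task exited: Parent died.");
        }
    }

    /** The thread dump request reaches the task's container. */
    void threadDumpSignalArrives() {
        events.add(taskAlive ? "thread dump written" : "no thread dump");
    }

    /** Runs one ordering; heartbeatInWindow selects the failing interleaving. */
    public static List<String> run(boolean heartbeatInWindow) {
        TaskTimeoutRace sim = new TaskTimeoutRace("task_1");
        sim.amTimesOutTask("task_1");
        if (heartbeatInWindow) {
            // Heartbeat lands after removal but before the dump signal.
            sim.taskHeartbeats("task_1");
        }
        sim.threadDumpSignalArrives();
        return sim.events;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

In this model, `run(true)` corresponds to the sporadic failure (the task exits before any thread dump is taken), while `run(false)` corresponds to the ordering the test expects.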
Is broken by: MAPREDUCE-5124 AM lacks flow control for task events (Resolved)