Details
Bug
Status: Closed
Major
Resolution: Fixed
0.9.1
None
None
Description
The stack trace:
"main":
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.createStatus(TaskTracker.java:880)
- waiting to lock <0xea101658> (a org.apache.hadoop.mapred.TaskTracker$TaskInProgress)
at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:489)
- locked <0x75505f00> (a org.apache.hadoop.mapred.TaskTracker)
at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:442)
at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:720)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1374)
"taskCleanup":
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.cleanup(TaskTracker.java:1072)
- waiting to lock <0x75505f00> (a org.apache.hadoop.mapred.TaskTracker)
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:1013)
- locked <0xea101658> (a org.apache.hadoop.mapred.TaskTracker$TaskInProgress)
at org.apache.hadoop.mapred.TaskTracker$1.run(TaskTracker.java:144)
at java.lang.Thread.run(Thread.java:595)
Found 1 deadlock.
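The jobHasFinished and transmitHeartBeat methods lock the TaskTracker and the TaskInProgress (TIP) in opposite orders: transmitHeartBeat holds the TaskTracker lock and waits for the TIP, while jobHasFinished holds the TIP lock and waits for the TaskTracker. Also, before emitting the heartbeat we should update the status and remove entries from runningTasks. Currently this is done after the heartbeat.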
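
The lock-ordering problem can be illustrated with a minimal sketch. This is not the actual patch: the class LockOrderSketch, the two plain Object monitors, the counters, and the runBoth helper are all hypothetical stand-ins, and only the method names are borrowed from the trace. The sketch assumes one possible fix, namely that both paths take the tracker lock first and the TIP lock second, so neither thread can hold one lock while waiting for the other.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    // Hypothetical stand-ins for the TaskTracker and one TaskInProgress monitor.
    static final Object tracker = new Object();
    static final Object tip = new Object();
    static int heartbeats = 0;
    static int cleanups = 0;

    // Same order as the "main" thread in the trace: tracker first, then tip.
    static void transmitHeartBeat() {
        synchronized (tracker) {
            synchronized (tip) {
                heartbeats++;
            }
        }
    }

    // In the trace, the cleanup path takes tip first and then tracker.
    // Here it is assumed to take tracker first, matching transmitHeartBeat.
    static void jobHasFinished() {
        synchronized (tracker) {
            synchronized (tip) {
                cleanups++;
            }
        }
    }

    // Runs both paths concurrently; returns true if both threads finish in time.
    static boolean runBoth(final int iterations) {
        final CountDownLatch done = new CountDownLatch(2);
        Thread heartbeat = new Thread(() -> {
            for (int i = 0; i < iterations; i++) transmitHeartBeat();
            done.countDown();
        });
        Thread cleanup = new Thread(() -> {
            for (int i = 0; i < iterations; i++) jobHasFinished();
            done.countDown();
        });
        // Daemon threads so a hang cannot keep the JVM alive.
        heartbeat.setDaemon(true);
        cleanup.setDaemon(true);
        heartbeat.start();
        cleanup.start();
        try {
            return done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runBoth(100000) ? "completed" : "timed out");
    }
}
```

Swapping the two synchronized blocks in jobHasFinished reproduces the ordering shown in the trace, in which case the two threads can block each other and runBoth may time out.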