Description
Hi David,
Our nightly benchmarks are failing intermittently (2 to 4 of them per night) because of the deadlock below in the JobTracker (JT), which appears to involve the Simon metrics context. Do you have time to fix this in the morning?
Thanks,
Nige
Found one Java-level deadlock:
=============================
"expireLaunchingTasks":
waiting to lock monitor 0x08141b44 (object 0x57eafdd0, a org.apache.hadoop.mapred.JobTracker),
which is held by "IPC Server handler 8 on 50020"
"IPC Server handler 8 on 50020":
waiting to lock monitor 0x08141630 (object 0x57de46b8, a com.yahoo.simon.hadoop.metrics.SimonContext),
which is held by "Timer-0"
"Timer-0":
waiting to lock monitor 0x08141b44 (object 0x57eafdd0, a org.apache.hadoop.mapred.JobTracker),
which is held by "IPC Server handler 8 on 50020"
Java stack information for the threads listed above:
===================================================
"expireLaunchingTasks":
at org.apache.hadoop.mapred.JobTracker$ExpireLaunchingTasks.run(JobTracker.java:152)
- waiting to lock <0x57eafdd0> (a org.apache.hadoop.mapred.JobTracker)
at java.lang.Thread.run(Thread.java:619)
"IPC Server handler 8 on 50020":
at org.apache.hadoop.metrics.spi.AbstractMetricsContext.createRecord(AbstractMetricsContext.java:192)
- waiting to lock <0x57de46b8> (a com.yahoo.simon.hadoop.metrics.SimonContext)
at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:130)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1383)
- locked <0x57eafdd0> (a org.apache.hadoop.mapred.JobTracker)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
"Timer-0":
at org.apache.hadoop.mapred.JobTracker.getRunningJobs(JobTracker.java:943)
- waiting to lock <0x57eafdd0> (a org.apache.hadoop.mapred.JobTracker)
at org.apache.hadoop.mapred.JobTracker$JobTrackerMetrics.doUpdates(JobTracker.java:429)
at org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:275)
- locked <0x57de46b8> (a com.yahoo.simon.hadoop.metrics.SimonContext)
at org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:48)
at org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:242)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Found 1 deadlock.
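Reading the dump above, the cycle is between two monitors. The IPC handler takes the JobTracker monitor in submitJob and then waits for the SimonContext monitor in createRecord, while Timer-0 takes the SimonContext monitor in timerEvent and then waits for the JobTracker monitor in getRunningJobs. expireLaunchingTasks is only a bystander blocked on the JobTracker monitor. One possible way to break such a cycle is to never hold both monitors at once. The sketch below illustrates that idea only; it is not the actual Hadoop code or the fix that was applied, and every name in it (LockOrderSketch, trackerLock, contextLock, runningJobs) is a hypothetical stand-in.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the two monitors in the dump; not Hadoop code.
class LockOrderSketch {
    static final Object trackerLock = new Object();  // plays the JobTracker monitor
    static final Object contextLock = new Object();  // plays the SimonContext monitor
    static final AtomicInteger runningJobs = new AtomicInteger();

    // Submit path: touch the metrics context BEFORE taking the tracker lock,
    // so the two monitors are never held at the same time.
    static void submitJob() {
        synchronized (contextLock) {
            // e.g. create the metrics record here
        }
        synchronized (trackerLock) {
            runningJobs.incrementAndGet();
        }
    }

    // Timer path: holds the context lock, but reads a lock-free counter
    // instead of calling back into a method that needs the tracker lock.
    static int timerEvent() {
        synchronized (contextLock) {
            return runningJobs.get();
        }
    }

    // Runs both paths concurrently and reports whether they finished
    // and the JVM sees no deadlocked threads.
    static boolean runWithoutDeadlock(final int iterations) {
        Thread submitter = new Thread(() -> {
            for (int i = 0; i < iterations; i++) submitJob();
        });
        Thread timer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) timerEvent();
        });
        submitter.start();
        timer.start();
        try {
            submitter.join(10_000);
            timer.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        long[] deadlocked =
            ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return deadlocked == null && !submitter.isAlive() && !timer.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: "
            + runWithoutDeadlock(100_000));
    }
}
```

The other common option is to keep both locks but always acquire them in the same order on every path. Whether either approach fits the JobTracker depends on what createRecord and doUpdates actually need to do while locked, which the dump alone does not show.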