Details
- Bug
- Status: Closed
- Blocker
- Resolution: Cannot Reproduce
- 0.18.0
- None
- None
- None
Description
A few reduces got stuck in a sort500 job with the following thread dump:
"main" prio=10 tid=0x0805b800 nid=0x1951 waiting for monitor entry [0xf7e6d000..0xf7e6e1f8]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2485)
    - waiting to lock <0xe905e8f8> (a java.util.LinkedList)
    - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
    - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
    - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:58)
    - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:39)
    at java.io.DataOutputStream.writeInt(DataOutputStream.java:181)
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1014)
    - locked <0xe90889e8> (a org.apache.hadoop.io.SequenceFile$Writer)
    at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:70)
    at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:298)
    at org.apache.hadoop.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:39)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:316)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2157)

"DataStreamer for file /rw/out/_temporary/_attempt_200806261801_0006_r_000712_0/part-00712 block blk_-3923696991063961587_9628" daemon prio=10 tid=0x08413c00 nid=0x367a in Object.wait() [0xd00e4000..0xd00e4f20]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.hadoop.ipc.Client.call(Client.java:701)
    - locked <0xf167d540> (a org.apache.hadoop.ipc.Client$Call)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.dfs.$Proxy2.recoverBlock(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2186)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1737)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1891)
    - locked <0xe905e8f8> (a java.util.LinkedList)
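
Reading the dump: "main" is BLOCKED on the LinkedList monitor <0xe905e8f8>, and the DataStreamer thread holds that same monitor while it waits inside the recoverBlock RPC. The dump does not say why the RPC reply never arrived. The Java sketch below is not Hadoop code; it only reproduces the same lock shape in isolation. All names in it (LockHeldAcrossRpcSketch, dataQueue, rpcReply, observeWriterState) are made up for illustration, and a CountDownLatch stands in for the pending RPC reply.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;

public class LockHeldAcrossRpcSketch {
    // Hypothetical stand-in for the queue shown in the dump as
    // <0xe905e8f8> (a java.util.LinkedList).
    private static final LinkedList<byte[]> dataQueue = new LinkedList<>();

    /** Returns the writer thread's state observed while the streamer still holds the queue lock. */
    static Thread.State observeWriterState() {
        CountDownLatch lockHeld = new CountDownLatch(1);
        // Hypothetical stand-in for the RPC reply the streamer is waiting on.
        CountDownLatch rpcReply = new CountDownLatch(1);

        Thread streamer = new Thread(() -> {
            synchronized (dataQueue) {      // lock taken before the blocking call
                lockHeld.countDown();
                try {
                    rpcReply.await();       // blocks while still holding dataQueue
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "DataStreamer-sketch");

        Thread writer = new Thread(() -> {
            synchronized (dataQueue) {      // same monitor the streamer holds
                dataQueue.addLast(new byte[0]);
            }
        }, "writer-sketch");

        streamer.setDaemon(true);
        writer.setDaemon(true);
        try {
            streamer.start();
            lockHeld.await();               // make sure the streamer owns the lock first
            writer.start();

            // Poll (up to 5 s) until the writer is parked on the monitor.
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (writer.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State observed = writer.getState();

            rpcReply.countDown();           // the "reply" arrives; both threads can finish
            streamer.join();
            writer.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing threads", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("writer state while streamer waits: " + observeWriterState());
    }
}
```

In the sketch the writer can only proceed once the latch is released. If the reply never came, the writer would stay blocked indefinitely, which matches what the dump shows for "main".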
Issue Links
- is related to HADOOP-3673 Deadlock in Datanode RPC servers (Closed)