|
Is it possible that this is a configuration issue?
Main concern is, of course, the loss of data blocks. So far they could be recovered by restarting the dead datanodes. When was the jstack you attached taken? It looks like following thread is blocking most of the other threads :
{nofomat}
"IPC Server handler 8 on 50020" daemon prio=10 tid=0x085ca800 nid=0x6a81 in Object.wait() [0xb0d76000..0xb0d76e20] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1143) - locked <0xb8be7008> (a org.apache.hadoop.util.Daemon) at java.lang.Thread.join(Thread.java:1196) at org.apache.hadoop.dfs.FSDataset.interruptOngoingCreates(FSDataset.java:777) - locked <0xb64e9c20> (a org.apache.hadoop.dfs.FSDataset) at org.apache.hadoop.dfs.FSDataset.updateBlock(FSDataset.java:795) - locked <0xb64e9c20> (a org.apache.hadoop.dfs.FSDataset) at org.apache.hadoop.dfs.DataNode.updateBlock(DataNode.java:3106) at org.apache.hadoop.dfs.LeaseManager.syncBlock(LeaseManager.java:503) at org.apache.hadoop.dfs.LeaseManager.recoverBlock(LeaseManager.java:471) at org.apache.hadoop.dfs.DataNode.recoverBlock(DataNode.java:3134) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888){nofomat} Not sure if this is the cause or a side effect. I wonder why there are so many blocks under recovery. The thread dump was taken long time (about 10 hrs) after the last log message containing above exception for this datanode.
In general, from what I observed for the whole job there was an unsually high number of write errors including reduce task failures compared to 0.17.2. In the 12 hours leading up to the last exception there were 300+ exceptions like above on this datanode alone. I checked a datanode which did not become dead. It showed similar order of magnitude of exceptions. Seems like a deadlock in Datanode. This essentially freezes the datanode and eventually it will marked dead.
The deadlock is between the threads nid=0xf05 (A) and nid=0x6a81 (B). B is waiting for A to exit, but A is trying to acquire a lock held by B. It seems to be new append related code in 0.18. I don't think the exceptions are related.
> It looks like following thread is blocking most of the other threads ...
This seems a deadlock: Server handler 8 is holding the lock 0xb64e9c20 but waiting for some thread t to complete. I guess the thread t is blocked by same lock 0xb64e9c20. Yet another case that shows why interrupting a thread for thread communication is not recommended.
I should note that thread 'A' above need not be nid=0xf05, it could be one of many other threads.
4517_20081024.patch: release the lock when calling join().
4517_20081024b_0.18.patch: for 0.18
Christian, could you try the patch and see whether it works? 4517_20081024c_0.18.patch: refactored the codes.
4517_20081024d_0.18.patch: fixed a bug.
4517_20081024d.patch: for trunk
[exec] -1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
Passed all unit tests locally. Don't know how to add new tests for this. Dhruba: Will you take a peek at this? Given the difficulty of doing a test, another person's inspection would be helpful!
Looks good. Couple of rename suggestions :
I think it will be better to get a confirmation from Christian's load before committing it.. or at least before 0.18.2 is released. Had a offline conversation with Nicholas. The latest patch could run into a problem if there is a race with a new competing writer. The activeThreads cannot be cleared before joining them. Nicholas will probably put up a slightly modified patch.
Sorry about the late response.
As a matter of fact, before the patch was available (although it was done very fast, thank you, Nicholas), the 3rd attempt was successful because I restarted DataNodes periodically. 4517_20081027.patch:
Thanks Raghu and Dhruba for reviewing. This looks good. I do not know of a unit test than can trigger this race. It might be tough to trigger this race in a standalone unit test.
4517_20081027_0.18.patch: for 0.18
[exec] -1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
I am running unit tests locally. Will commit them once they passed the tests. I just committed this.
Integrated in Hadoop-trunk #645 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/645/
. Release FSDataset lock before joining ongoing create threads. (szetszwo) |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2008-10-23 17:30:21,703 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_4622591820056312866_6335826 received exception java.io.EOFException: while trying to read 65625 bytes
2008-10-23 17:30:21,703 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(xxx.yyy.zzz.uuu:50010, storageID=DS-1873593537-xx.yyy.zzz.uuu-vvvv-1216412543041, infoPort=50075, ipcPort=50020):DataXceiver: java.io.EOFException: while trying to read 65625 bytes
at org.apache.hadoop.dfs.DataNode$BlockReceiver.readToBuf(DataNode.java:2464)
at org.apache.hadoop.dfs.DataNode$BlockReceiver.readNextPacket(DataNode.java:2508)
at org.apache.hadoop.dfs.DataNode$BlockReceiver.receivePacket(DataNode.java:2572)
at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:2698)
at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1283)
at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1045)
at java.lang.Thread.run(Thread.java:619)