Believe it or not, I'm seeing
HADOOP-3657 / HADOOP-3673 in a 5-node 0.20 cluster. I have many concurrent writes on the cluster, and when I kill a DN, some percentage of the time I get one of these cross-node deadlocks among 3 of the nodes (replication 3). All of the DN RPC server threads are tied up waiting on RPC clients to other datanodes.
|Field||Original Value||New Value|
|Fix Version/s||0.20-append [ 12315103 ]|
|Attachment||0013-HDFS-1056.-Fix-possible-multinode-deadlocks-during-b.patch [ 12447980 ]|
|Target Version/s||2.0.1 [ 12321440 ]|
|Status||Open [ 1 ]||Resolved [ 5 ]|
|Resolution||Unresolved [ 9 ]|