Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Abandoned
-
0.20.2, 0.21.0, 0.22.0
-
None
Description
Believe it or not, I'm seeing HADOOP-3657 / HADOOP-3673 in a 5-node 0.20 cluster. I have many concurrent writes on the cluster, and when I kill a DN, some percentage of the time I get one of these cross-node deadlocks among 3 of the nodes (replication 3). All of the DN RPC server threads are tied up waiting on RPC clients to other datanodes.