Hadoop HDFS / HDFS-1056

Multi-node RPC deadlocks during block recovery

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.20.2, 0.21.0, 0.22.0
    • Fix Version/s: 0.20-append
    • Component/s: datanode
    • Labels:
      None
    • Target Version/s:

      Description

      Believe it or not, I'm seeing HADOOP-3657 / HADOOP-3673 on a 5-node 0.20 cluster. The cluster handles many concurrent writes, and when I kill a DN, some percentage of the time I get one of these cross-node deadlocks among 3 of the nodes (replication factor 3). All of the DN RPC server threads are tied up waiting on RPC calls they have issued, as clients, to other datanodes.
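To make the failure mode in the description concrete, here is a minimal sketch of how a bounded pool of server threads can deadlock across nodes. It is not Hadoop's IPC code: the class `RpcDeadlockSketch`, the method `ringCompletes`, and the ring topology are all hypothetical, and plain thread pools stand in for DN RPC handler threads. Each simulated node serves "outer" calls on its handler threads, and every outer call blocks on an "inner" call to the next node. Once every handler on every node is busy with an outer call, no thread is left to serve the inner calls and nothing makes progress.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RpcDeadlockSketch {

    /**
     * Simulates `nodes` nodes arranged in a ring. Each node has a fixed pool of
     * `handlers` threads (a stand-in for RPC server handlers) and receives
     * `outerCalls` incoming calls (expected to be <= handlers). Serving an
     * incoming call requires a blocking call to the next node in the ring.
     * Returns true if every incoming call finishes before the timeout.
     */
    public static boolean ringCompletes(int nodes, int handlers, int outerCalls,
                                        long timeoutMillis) throws Exception {
        List<ExecutorService> pools = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            pools.add(Executors.newFixedThreadPool(handlers));
        }
        // Holds every outer call until all of them occupy a handler thread,
        // so the outcome does not depend on scheduling luck.
        CountDownLatch allRunning = new CountDownLatch(nodes * outerCalls);
        List<Future<String>> outer = new ArrayList<>();
        try {
            for (int i = 0; i < nodes; i++) {
                ExecutorService next = pools.get((i + 1) % nodes);
                for (int c = 0; c < outerCalls; c++) {
                    outer.add(pools.get(i).submit(() -> {
                        allRunning.countDown();
                        allRunning.await();
                        // Inner call: needs a free handler on the next node.
                        // With no timeout, this blocks forever if none is free.
                        return next.submit(() -> "ok").get();
                    }));
                }
            }
            for (Future<String> f : outer) {
                f.get(timeoutMillis, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            return false; // at least one call never finished: treated as deadlock
        } finally {
            for (ExecutorService p : pools) {
                p.shutdownNow(); // interrupt any stuck handler threads
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // All handlers busy on all three nodes: inner calls can never be served.
        System.out.println("saturated: " + ringCompletes(3, 2, 2, 500));
        // One handler per node left free: inner calls are served normally.
        System.out.println("headroom:  " + ringCompletes(3, 2, 1, 2000));
    }
}
```

In this sketch the saturated case should time out and the case with one spare handler per node should complete. The two usual ways out of such a cycle are visible here as well: bound the blocking inner call with a timeout, or serve inter-node calls on threads that incoming calls cannot exhaust. Whether either applies to the datanode code paths in this issue is not something the sketch establishes.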

        Activity

        Todd Lipcon created issue -
        dhruba borthakur made changes -
        Field: Fix Version/s
        New Value: 0.20-append [ 12315103 ]
        Nicolas Spiegelberg made changes -
        Eli Collins made changes -
        Target Version/s 2.0.1 [ 12321440 ]

          People

          • Assignee: Unassigned
          • Reporter: Todd Lipcon
          • Votes: 0
          • Watchers: 14

            Dates

            • Created:
              Updated: