Hadoop HDFS / HDFS-1056

Multi-node RPC deadlocks during block recovery

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.20.2, 0.21.0, 0.22.0
    • Fix Version/s: 0.20-append
    • Component/s: datanode
    • Labels:
      None

      Description

      Believe it or not, I'm seeing HADOOP-3657 / HADOOP-3673 in a 5-node 0.20 cluster. I have many concurrent writes on the cluster, and when I kill a DN, some percentage of the time I get one of these cross-node deadlocks among 3 of the nodes (replication 3). All of the DN RPC server threads are tied up waiting on RPC clients to other datanodes.
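
      The failure mode described above can be reproduced in miniature. The following is only a toy sketch, not Hadoop code: it assumes each node has a fixed pool of handler threads and that handling a request requires a blocking call to a peer node. The names `RpcDeadlockSketch`, `Node`, `recoverBlock`, and `ping` are hypothetical and built purely on `java.util.concurrent`. With one handler per node, both handlers end up waiting on each other's exhausted pool; with a spare handler the nested call completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RpcDeadlockSketch {

    /** Toy stand-in for a datanode: a fixed pool of "RPC handler" threads. */
    static final class Node {
        final ExecutorService handlers;
        Node peer;

        Node(int handlerCount) {
            handlers = Executors.newFixedThreadPool(handlerCount);
        }

        /** Hypothetical request whose handler makes a blocking call to the peer. */
        Future<String> recoverBlock(CountDownLatch bothBusy) {
            return handlers.submit(() -> {
                bothBusy.countDown();
                bothBusy.await();          // wait until a handler is busy on each node
                return peer.ping().get();  // nested call: needs a free handler on the peer
            });
        }

        /** Trivial request that only needs a free handler thread. */
        Future<String> ping() {
            return handlers.submit(() -> "pong");
        }
    }

    /** Returns true if the two nodes fail to make progress within the timeout. */
    static boolean deadlocks(int handlersPerNode) throws Exception {
        Node a = new Node(handlersPerNode);
        Node b = new Node(handlersPerNode);
        a.peer = b;
        b.peer = a;
        CountDownLatch bothBusy = new CountDownLatch(2);
        Future<String> fa = a.recoverBlock(bothBusy);
        Future<String> fb = b.recoverBlock(bothBusy);
        try {
            fa.get(1000, TimeUnit.MILLISECONDS);
            fb.get(1000, TimeUnit.MILLISECONDS);
            return false;                  // both nested calls completed
        } catch (TimeoutException e) {
            return true;                   // handlers are stuck waiting on each other
        } finally {
            a.handlers.shutdownNow();      // interrupt stuck handlers so the JVM can exit
            b.handlers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 handler per node deadlocks: " + deadlocks(1));
        System.out.println("2 handlers per node deadlocks: " + deadlocks(2));
    }
}
```

      In this model a larger pool only makes the hang less likely: once enough concurrent requests occupy every handler on each node in the cycle, the same wait-for loop forms, which is consistent with seeing the problem only some percentage of the time under many concurrent writes.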

        Activity

        Field               Original Value    New Value

        Allen Wittenauer made changes -
        Status              Open [ 1 ]        Resolved [ 5 ]
        Target Version/s
        Resolution                            Unresolved [ 9 ]
        Eli Collins made changes -
        Target Version/s                      2.0.1 [ 12321440 ]
        Nicolas Spiegelberg made changes -
        dhruba borthakur made changes -
        Fix Version/s                         0.20-append [ 12315103 ]
        Todd Lipcon created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Todd Lipcon
          • Votes:
            0
            Watchers:
            15
