  1. Hadoop Common
  2. HADOOP-1255

Name-node falls into infinite loop trying to remove a dead node.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.12.3
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

      Description

      Under certain conditions the name-node falls into an infinite loop in heartbeatCheck().
      It is rather hard to reproduce. I'm running a one-node cluster: 1 name-node, 1 data-node.
      The data-node dies, and 10 minutes later I get

      07/04/12 10:40:34 INFO net.NetworkTopology: Removing a node: /default-rack/0.0.0.0:50077
      07/04/12 10:44:35 INFO dfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 0.0.0.0:50077
      ...................................................
      07/04/12 10:45:17 INFO net.NetworkTopology: Removing a node: /default-rack/0.0.0.0:50077
      07/04/12 10:47:44 INFO dfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 0.0.0.0:50077

      Here is what I see in the debugger:
      FSNamesystem.heartbeats contains two entries that refer to the same DatanodeDescriptor instance,
      and that instance has DatanodeDescriptor.isAlive = false. heartbeatCheck() correctly detects that
      the list contains a dead node, but removeDatanode() does not remove it from heartbeats because the
      node is already marked dead.
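
      The state seen in the debugger can be modelled with a small stand-alone sketch. This is not the
      FSNamesystem code: the class HeartbeatLoopSketch, its Node type, findDeadNode() and the maxPasses
      cap are all hypothetical, and the isAlive guard in removeDatanode() is an assumption based only on
      the behaviour described above. The cap stands in for "forever" so that the example terminates.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the reported state; not the actual Hadoop classes.
class HeartbeatLoopSketch {
    static class Node {
        final String name;
        boolean isAlive = true;
        Node(String name) { this.name = name; }
    }

    final List<Node> heartbeats = new ArrayList<>();

    // Assumed from the report: a node that is already marked dead is skipped,
    // so any entry still left in the list for it is never removed.
    void removeDatanode(Node node) {
        if (node.isAlive) {
            heartbeats.remove(node);
            node.isAlive = false;
        }
    }

    // Returns the first dead node still present in the list, or null if there is none.
    Node findDeadNode() {
        for (Node n : heartbeats) {
            if (!n.isAlive) {
                return n;
            }
        }
        return null;
    }

    // Repeats "find a dead node, remove it" until none is found or the cap is hit.
    // Returns the number of passes performed.
    int heartbeatCheck(int maxPasses) {
        int passes = 0;
        while (passes < maxPasses) {
            Node dead = findDeadNode();
            if (dead == null) {
                break;
            }
            removeDatanode(dead);
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        HeartbeatLoopSketch ns = new HeartbeatLoopSketch();
        Node dn = new Node("0.0.0.0:50077");
        // Reproduce the debugger state: two entries for the same instance, already dead.
        dn.isAlive = false;
        ns.heartbeats.add(dn);
        ns.heartbeats.add(dn);

        int passes = ns.heartbeatCheck(1000);
        // prints "passes=1000, entries left=2"
        System.out.println("passes=" + passes + ", entries left=" + ns.heartbeats.size());
    }
}
```

      In this model every pass finds the same dead node and removes nothing, so only the cap ends the
      loop. How the duplicate entry got into the list in the first place is not covered by the sketch.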

        Attachments

        1. heartbeat.patch (0.7 kB, Hairong Kuang)
        2. heartbeat.patch (1 kB, Hairong Kuang)

              People

              • Assignee: Hairong Kuang (hairong)
              • Reporter: Konstantin Shvachko (shv)
