  1. Hadoop HDFS
  2. HDFS-7714

Simultaneous restart of HA NameNodes and DataNode can cause DataNode to register successfully with only one NameNode.

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      In an HA deployment, each DataNode must register with both NameNodes and send periodic heartbeats and block reports to both. If the NameNodes and DataNodes are restarted simultaneously, however, registration can hit a race condition. The end result is that the BPServiceActor for one NameNode terminates while the BPServiceActor for the other NameNode remains alive. The DataNode process is then in a "half-alive" state in which it heartbeats and sends block reports to only one of the NameNodes. If the other NameNode becomes active after an HA failover, it has no registration or block reports from this DataNode, which could cause a loss of storage capacity. The DataNode process would have to be restarted to resolve this.
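
      To make the "half-alive" state concrete, here is a minimal sketch of how it could be modeled and detected. It is not the Hadoop implementation and not the attached patch: `HalfAliveCheck`, `Actor`, `register`, `deadActors`, and `isHalfAlive` are hypothetical names, and `Actor` is only a stand-in for one per-NameNode BPServiceActor. The sketch assumes that a failed registration terminates the actor, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the half-alive state; not Hadoop code. */
final class HalfAliveCheck {

    /** Stand-in for one per-NameNode service actor (assumption). */
    static final class Actor {
        final String nameNode;
        private volatile boolean alive = true;

        Actor(String nameNode) {
            this.nameNode = nameNode;
        }

        /** Simulates registration; a failed attempt terminates the actor. */
        void register(boolean succeeds) {
            if (!succeeds) {
                alive = false;
            }
        }

        boolean isAlive() {
            return alive;
        }
    }

    /** Returns the NameNodes whose actor has terminated. */
    static List<String> deadActors(List<Actor> actors) {
        List<String> dead = new ArrayList<>();
        for (Actor a : actors) {
            if (!a.isAlive()) {
                dead.add(a.nameNode);
            }
        }
        return dead;
    }

    /** Half-alive: at least one actor is dead and at least one is alive. */
    static boolean isHalfAlive(List<Actor> actors) {
        int dead = deadActors(actors).size();
        return dead > 0 && dead < actors.size();
    }

    public static void main(String[] args) {
        List<Actor> actors = Arrays.asList(new Actor("nn1"), new Actor("nn2"));
        actors.get(0).register(true);   // registration with nn1 succeeds
        actors.get(1).register(false);  // registration with nn2 loses the race
        System.out.println("half-alive: " + isHalfAlive(actors));
        System.out.println("no actor for: " + deadActors(actors));
    }
}
```

      A check of this shape only surfaces the condition. It does not recover from it; per the description, the DataNode process still has to be restarted.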

        Attachments

        1. HDFS-7714-002.patch
          1 kB
          Vinayakumar B
        2. HDFS-7714-001.patch
          2 kB
          Vinayakumar B

              People

              • Assignee:
                vinayrpet Vinayakumar B
                Reporter:
                cnauroth Chris Nauroth
              • Votes:
                0
                Watchers:
                13