Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Won't Fix
HDFS-7240
Description
The datanode ID is used when a datanode registers. It is assumed that datanode IDs are unique across the cluster.
However, due to operator error or other causes, we might encounter a duplicate datanode ID. SCM should be able to recognize this and handle it correctly. Here is a subset of the datanode scenarios it needs to handle.
1. Normal Datanode
2. Copy of a Datanode metadata by operator to another node
3. A Datanode being renamed - hostname change
4. Container Reports – 2 machines with same datanode ID. SCM thinks they are same node.
5. Decommission – we decommission both nodes if IDs are same.
6. Commands will be sent to both nodes.
So it is necessary that SCM identify when a datanode is reusing a datanode ID that is already in use by another node.
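
The check described above could be sketched roughly as follows. This is a minimal illustration, not the actual SCM code: the class `DatanodeIdRegistry`, its `Result` values, and the `register` signature are all hypothetical names. It also assumes, as a simplification, that a known ID arriving from a different IP address means a second machine is presenting a copied ID, while the same IP with a new hostname is a rename. A real implementation would likely need a stronger signal (for example, whether the original node is still heartbeating).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of duplicate datanode ID detection; not the real SCM API. */
public class DatanodeIdRegistry {

  /** Outcome of a registration attempt. */
  public enum Result { REGISTERED, RE_REGISTERED, RENAMED, DUPLICATE_ID }

  /** Network identity last seen for a given datanode ID. */
  private static final class Endpoint {
    final String hostname;
    final String ipAddress;

    Endpoint(String hostname, String ipAddress) {
      this.hostname = hostname;
      this.ipAddress = ipAddress;
    }
  }

  private final Map<String, Endpoint> nodesById = new HashMap<>();

  /** Classifies a registration request against what is already known for the ID. */
  public synchronized Result register(String datanodeId, String hostname, String ipAddress) {
    Endpoint known = nodesById.get(datanodeId);
    if (known == null) {
      // Scenario 1: normal datanode, first registration.
      nodesById.put(datanodeId, new Endpoint(hostname, ipAddress));
      return Result.REGISTERED;
    }
    if (!known.ipAddress.equals(ipAddress)) {
      // Scenario 2 (assumption): same ID from another address is treated as
      // copied metadata. The original record is kept and the caller is flagged.
      return Result.DUPLICATE_ID;
    }
    if (!known.hostname.equals(hostname)) {
      // Scenario 3: same machine, hostname changed. Update the stored record.
      nodesById.put(datanodeId, new Endpoint(hostname, ipAddress));
      return Result.RENAMED;
    }
    // Same ID, same hostname, same address: the node is re-registering.
    return Result.RE_REGISTERED;
  }
}
```

With a result such as `DUPLICATE_ID`, the caller could refuse the registration so that container reports, decommission requests, and commands are never applied to two machines under one ID.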