Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.7.1
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
We have seen a use case of decommissioning DataNodes that are already dead or unresponsive and are not expected to rejoin the cluster. In one large cluster we found more than 100 nodes stuck in the dead, decommissioning state, even though their "Under replicated blocks" and "Blocks with no live replicas" counts were both zero. This was originally fixed in HDFS-7374; after that fix, running refreshNodes twice eliminates the problem. However, the fix appears to have been lost in the refactor done in HDFS-7411. We are running a Hadoop version based on 2.7.1, and only the following sequence of operations transitions a node from (Dead, DECOMMISSION_INPROGRESS) to (Dead, DECOMMISSIONED):
- Remove the node from the hdfs-exclude file
- Run refreshNodes
- Re-add the node to the hdfs-exclude file
- Run refreshNodes again
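The four-step workaround above can be sketched as a shell script. The exclude-file path and node name below are illustrative assumptions (adjust to your deployment); `hdfs dfsadmin -refreshNodes` is the real admin command, guarded here so the sketch is harmless on a machine without Hadoop installed.

```shell
# Hypothetical exclude-file path and dead-node hostname -- adjust as needed.
EXCLUDE_FILE="${EXCLUDE_FILE:-/tmp/hdfs-exclude}"
DEAD_NODE="dead-node.example.com"

refresh() {
  # Ask the NameNode to re-read the include/exclude lists.
  # Skipped silently when the hdfs CLI is not on PATH (sketch only).
  command -v hdfs >/dev/null 2>&1 && hdfs dfsadmin -refreshNodes || true
}

# 1. Remove the dead node from the exclude file, then refresh.
touch "$EXCLUDE_FILE"
sed -i "/^${DEAD_NODE}\$/d" "$EXCLUDE_FILE"
refresh

# 2. Re-add it and refresh again; the node then transitions from
#    (Dead, DECOMMISSION_INPROGRESS) to (Dead, DECOMMISSIONED).
echo "$DEAD_NODE" >> "$EXCLUDE_FILE"
refresh
```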
So, why was this code removed in the refactor that produced the new DecommissionManager?
if (!node.isAlive) {
  LOG.info("Dead node " + node + " is decommissioned immediately.");
  node.setDecommissioned();
}