  1. Hadoop HDFS
  2. HDFS-15945

DataNodes with zero capacity and zero blocks should be decommissioned immediately


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem

    Description

      In some situations, such as a storage problem, a DataNode's capacity and block count can both become zero.
      When we tried to decommission such DataNodes, the decommission never completed because the NameNode had not received their first block report.

      INFO  blockmanagement.DatanodeAdminManager (DatanodeAdminManager.java:startDecommission(183)) - Starting decommission of 127.0.0.1:58343 [DISK]DS-a29de094-2b19-4834-8318-76cda3bd86bf:NORMAL:127.0.0.1:58343 with 0 blocks
      INFO  blockmanagement.BlockManager (BlockManager.java:isNodeHealthyForDecommissionOrMaintenance(4587)) - Node 127.0.0.1:58343 hasn't sent its first block report.
      INFO  blockmanagement.DatanodeAdminDefaultMonitor (DatanodeAdminDefaultMonitor.java:check(258)) - Node 127.0.0.1:58343 isn't healthy. It needs to replicate 0 more blocks. Decommission In Progress is still in progress.
      

      To make matters worse, even after we stopped these DataNodes, they remained in a dead and decommissioning state until the NameNode was restarted.

      I think those DataNodes should be decommissioned immediately, even if the NameNode hasn't received their first block report.
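
The proposal above can be illustrated with a minimal sketch. It is a simplified model built only from the log messages in this report, not the actual BlockManager or DatanodeAdminManager code; the class NodeState and both methods are hypothetical names introduced for illustration, and the real health check may consider more conditions than are modeled here.

```java
// Hypothetical sketch: none of these types are Hadoop classes.
class DecommissionCheckSketch {

    /** Simplified stand-in for what the NameNode knows about one DataNode. */
    static final class NodeState {
        final long capacityBytes;
        final long blockCount;
        final boolean firstBlockReportReceived;
        final long blocksPendingReplication;

        NodeState(long capacityBytes, long blockCount,
                  boolean firstBlockReportReceived, long blocksPendingReplication) {
            this.capacityBytes = capacityBytes;
            this.blockCount = blockCount;
            this.firstBlockReportReceived = firstBlockReportReceived;
            this.blocksPendingReplication = blocksPendingReplication;
        }
    }

    /**
     * Behavior as described in the report: a node that has not sent its
     * first block report is never treated as healthy, so decommission stalls.
     */
    static boolean canFinishCurrent(NodeState n) {
        return n.firstBlockReportReceived && n.blocksPendingReplication == 0;
    }

    /**
     * Proposed behavior: a node with zero capacity and zero blocks has
     * nothing to replicate, so it does not wait for a block report.
     */
    static boolean canFinishProposed(NodeState n) {
        boolean empty = n.capacityBytes == 0 && n.blockCount == 0;
        return empty || canFinishCurrent(n);
    }

    public static void main(String[] args) {
        // The node from the logs: 0 blocks, no first block report.
        NodeState broken = new NodeState(0L, 0L, false, 0L);
        System.out.println("current:  " + canFinishCurrent(broken));
        System.out.println("proposed: " + canFinishProposed(broken));
    }
}
```

In this model, a node that still reports capacity or blocks is unaffected by the proposed rule and continues to wait for its first block report.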

            People

              Assignee: tasanuma Takanobu Asanuma
              Reporter: tasanuma Takanobu Asanuma
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h