Hadoop HDFS / HDFS-15634

Invalidate block on decommissioning DataNode after replication


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs

      Description

      Right now, when a DataNode starts decommissioning, the NameNode marks it as decommissioning and replicates its blocks to other DataNodes; once replication completes, the node is marked as decommissioned. The replicas left on that node are not touched afterwards, since they are no longer counted as live replicas.

      Proposal: Invalidate these blocks once they have been replicated and there are enough live replicas in the cluster.

      Reason: A recent shutdown of decommissioned DataNodes to finish the decommission flow caused a NameNode latency spike, because the NameNode has to remove all of those nodes' blocks from its memory, and this step requires holding the write lock. If these blocks had been invalidated gradually, the final deletion would be much easier and faster.
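
      The proposed check can be sketched as follows. This is a hypothetical, self-contained model of the decision logic, not actual HDFS code: the `Replica`, `NodeState`, and `canInvalidate` names are illustrative assumptions. The idea is that a replica on a decommissioning/decommissioned node becomes eligible for invalidation as soon as the replication factor is satisfied by live nodes alone.

      ```java
      import java.util.List;

      // Hypothetical sketch (not HDFS internals): decide when a replica on a
      // decommissioning or decommissioned DataNode can be invalidated early,
      // instead of being bulk-removed when the node is finally shut down.
      public class GradualInvalidationSketch {

          enum NodeState { LIVE, DECOMMISSIONING, DECOMMISSIONED }

          // One replica of a block, located on a DataNode in a given state.
          record Replica(String datanodeId, NodeState state) {}

          // Count replicas residing on live (non-decommissioning) nodes;
          // replicas on decommissioning nodes do not count as live.
          static long liveReplicaCount(List<Replica> replicas) {
              return replicas.stream()
                             .filter(r -> r.state() == NodeState.LIVE)
                             .count();
          }

          // The stale replica may be invalidated once enough live replicas
          // exist elsewhere to satisfy the block's replication factor.
          static boolean canInvalidate(List<Replica> replicas, int replicationFactor) {
              return liveReplicaCount(replicas) >= replicationFactor;
          }

          public static void main(String[] args) {
              List<Replica> replicas = List.of(
                  new Replica("dn1", NodeState.DECOMMISSIONED),
                  new Replica("dn2", NodeState.LIVE),
                  new Replica("dn3", NodeState.LIVE),
                  new Replica("dn4", NodeState.LIVE));
              // Replication factor 3 is met by three live replicas, so the
              // copy on dn1 can be invalidated now rather than at shutdown.
              System.out.println(canInvalidate(replicas, 3)); // prints "true"
          }
      }
      ```

      Invalidating replicas incrementally this way spreads the block-removal work (and the associated write-lock holds) over time, which is the latency improvement the proposal is after.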

        Attachments

        1. write lock.png
          27 kB
          Fengnan Li

          Issue Links

            Activity

              People

              • Assignee:
                fengnanli Fengnan Li
              • Reporter:
                fengnanli Fengnan Li
              • Votes:
                0
              • Watchers:
                4

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 1h