[HDFS-14624] When decommissioning a node, log remaining blocks to replicate periodically - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.3.0
Fix Version/s: 3.3.0, 3.1.4, 3.2.2
Component/s: namenode
Labels: None
Target Version/s: 3.3.0
Hadoop Flags: Reviewed

Description

When a node is marked for decommission, a monitor thread runs every 30 seconds by default and checks whether the node still has blocks pending replication before it can complete decommissioning.

The monitor thread, in DatanodeAdminManager$Monitor.check(), already logs the relevant information in two debug-level messages. The first is logged as the pending blocks are replicated:

LOG.debug("Node {} still has {} blocks to replicate "
    + "before it is a candidate to finish {}.",
    dn, blocks.size(), dn.getAdminState());

And then after the initial set of blocks has completed and a rescan happens:

LOG.debug("Node {} {} healthy."
    + " It needs to replicate {} more blocks."
    + " {} is still in progress.", dn,
    isHealthy ? "is" : "isn't", blocks.size(), dn.getAdminState());

I propose moving these messages to INFO level so that decommission progress is easier to monitor over time from the Namenode log.

With the default settings, this would produce at most one log message every 30 seconds for each node being decommissioned. It is an upper bound because the monitor thread stops each run after checking 500K blocks, so in practice there could be as few as one log message per 30 seconds, even if many DNs are being decommissioned at the same time.
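The log-volume estimate above can be checked with a little arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and messagesInOneRun is a deliberately simplified model of a monitor run (one message per visited node, stopping once the per-run block limit has been reached), not the actual DatanodeAdminManager logic.

```java
public class DecommissionLogRate {

    /** Upper bound: one message per decommissioning node per monitor run. */
    public static long maxMessagesPerHour(int decommissioningNodes, int intervalSeconds) {
        if (decommissioningNodes < 0 || intervalSeconds <= 0) {
            throw new IllegalArgumentException("nodes must be >= 0 and interval > 0");
        }
        long runsPerHour = 3600L / intervalSeconds;
        return runsPerHour * decommissioningNodes;
    }

    /**
     * Simplified model of one monitor run (an assumption, not Hadoop code):
     * nodes are visited in order, one message is logged per visited node, and
     * the run stops once the blocks checked so far have reached the limit.
     */
    public static int messagesInOneRun(long[] pendingBlocksPerNode, long blockLimitPerRun) {
        long checked = 0;
        int messages = 0;
        for (long pending : pendingBlocksPerNode) {
            if (checked >= blockLimitPerRun) {
                break; // limit reached: remaining nodes wait for the next run
            }
            checked += pending;
            messages++;
        }
        return messages;
    }

    public static void main(String[] args) {
        // 10 nodes, 30-second interval: 120 runs per hour, 10 messages per run.
        System.out.println(maxMessagesPerHour(10, 30)); // 1200
        // Three large nodes: the 500K limit is hit after the first node.
        System.out.println(messagesInOneRun(new long[] {600_000, 600_000, 600_000}, 500_000)); // 1
        // Three small nodes: all fit within the limit.
        System.out.println(messagesInOneRun(new long[] {100_000, 100_000, 100_000}, 500_000)); // 3
    }
}
```

Under this model, ten decommissioning nodes produce at most 1200 messages per hour, and fewer whenever the block limit cuts a run short.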

Note that the namenode webUI does display the above information, but having this in the NN logs would allow progress to be tracked more easily.
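As an illustration of tracking progress from the NN log, the following sketch extracts the node and remaining block count from the first message quoted above. The class name, the Progress holder, and the sample log line (including its timestamp prefix and node address) are hypothetical; only the message text after "Node" is taken from the existing LOG.debug call.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DecommissionProgressParser {

    /** Values extracted from one progress message. */
    public static final class Progress {
        public final String node;
        public final long remainingBlocks;
        public final String adminState;

        Progress(String node, long remainingBlocks, String adminState) {
            this.node = node;
            this.remainingBlocks = remainingBlocks;
            this.adminState = adminState;
        }
    }

    // Mirrors the format string of the first message; find() is used so that
    // any logger prefix (timestamp, level, class) before "Node" is ignored.
    private static final Pattern STILL_HAS = Pattern.compile(
        "Node (.+?) still has (\\d+) blocks to replicate "
        + "before it is a candidate to finish (.+?)\\.\\s*$");

    /** Returns the parsed progress, or empty if the line is not a progress message. */
    public static Optional<Progress> parse(String logLine) {
        Matcher m = STILL_HAS.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(
            new Progress(m.group(1), Long.parseLong(m.group(2)), m.group(3)));
    }

    public static void main(String[] args) {
        // Hypothetical sample line; the prefix and node address are made up.
        String line = "2019-07-02 15:30:00,000 INFO Node dn1.example.com:9866 still has 1234 "
            + "blocks to replicate before it is a candidate to finish Decommission In Progress.";
        parse(line).ifPresent(p ->
            System.out.println(p.node + " -> " + p.remainingBlocks + " blocks remaining"));
    }
}
```

Feeding successive log lines through parse and recording remainingBlocks per node would give a simple progress series over time.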

Attachments

- HDFS-14624.001.patch, 02/Jul/19 15:48, 1 kB, Stephen O'Donnell
- HDFS-14624.002.patch, 04/Jul/19 16:04, 2 kB, Stephen O'Donnell
- HDFS-14624.003.patch, 11/Jul/19 14:07, 2 kB, Stephen O'Donnell

People

Assignee: Stephen O'Donnell

Reporter: Stephen O'Donnell

Votes: 0

Watchers: 4

Dates

Created: 02/Jul/19 15:30

Updated: 04/Oct/19 00:38

Resolved: 11/Jul/19 15:55