When a node is marked for decommission, a monitor thread runs every 30 seconds by default and checks whether the node still has blocks pending replication before the node can complete decommissioning.
The monitor thread already logs the correct information in two existing DEBUG level messages in DatanodeAdminManager$Monitor.check(). The first is logged as the pending blocks are replicated:
The second is logged after the initial set of blocks has completed and a rescan happens:
I would like to propose moving these messages to INFO level so it is easier to monitor decommission progress over time from the Namenode log.
With the default settings, this would result in at most 1 log message per decommissioning node every 30 seconds. It is "at most" because the monitor thread stops a run once it has checked 500K blocks, so in practice there could be as few as 1 log message in total per 30 seconds, even if many DNs are being decommissioned at the same time.
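To make the log volume estimate concrete, here is a rough sketch of the bound described above. It is not the actual DatanodeAdminManager code: the class name, method names and the simplified stop condition are all hypothetical, and it only assumes that each monitor run logs one message per node it checks and stops visiting further nodes once the per-run block limit has been exceeded.

```java
import java.util.List;

public class DecommissionLogSketch {
    // Defaults quoted in this description: 30 second interval, 500K blocks per run.
    static final int INTERVAL_SECONDS = 30;
    static final long BLOCKS_PER_RUN = 500_000L;

    /**
     * Hypothetical model of one monitor run: visit nodes in order, count one
     * log message per visited node, and stop once the block limit is exceeded.
     */
    static int messagesInOneRun(List<Long> pendingBlocksPerNode, long blocksPerRun) {
        long blocksChecked = 0;
        int messages = 0;
        for (long blocks : pendingBlocksPerNode) {
            if (blocksChecked > blocksPerRun) {
                break; // limit exceeded, remaining nodes wait for a later run
            }
            blocksChecked += blocks;
            messages++; // one INFO message for this node
        }
        return messages;
    }

    /** Upper bound on messages per hour if every node is checked in every run. */
    static long maxMessagesPerHour(int nodes, int intervalSeconds) {
        return (3600L / intervalSeconds) * nodes;
    }

    public static void main(String[] args) {
        // Three small nodes: all are checked, so three messages in this run.
        System.out.println(messagesInOneRun(List.of(1_000L, 2_000L, 3_000L), BLOCKS_PER_RUN));
        // Three large nodes: the first one exhausts the limit, so one message.
        System.out.println(messagesInOneRun(List.of(600_000L, 600_000L, 600_000L), BLOCKS_PER_RUN));
        // Worst case for three nodes at the default interval: 360 messages per hour.
        System.out.println(maxMessagesPerHour(3, INTERVAL_SECONDS));
    }
}
```

Under these assumptions, even the worst case stays at 120 messages per hour for each node being decommissioned, and large nodes reduce the total further because fewer nodes are visited per run.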
Note that the namenode webUI does display the above information, but having this in the NN logs would allow progress to be tracked more easily.