Thanks Allen, Andrew and Akira for the discussion.
The original intention, to solve the issue, is good; thank you for working on it. On the discussion itself: Andrew's suggestion is good, and another option is to record the time of the latest call to UnderReplicatedBlocks#chooseUnderReplicatedBlocks. We already have metrics for underReplicatedBlocksCount/pendingReplicationBlocksCount/scheduledReplicationBlocksCount, so together these would tell us whether, and how long ago, the under-replicated list was last processed, if we really want to see that. My point is that it is not worth recording the whole under-replicated list just for this metric.
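To make the "record the latest time" option concrete, here is a minimal sketch. The class and method names (ReplicationScanTracker, recordScan, millisSinceLastScan) are hypothetical and are not existing Hadoop APIs; the assumption is that the replication monitor would call recordScan each time it invokes chooseUnderReplicatedBlocks, and that a metrics getter would expose the elapsed time.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of Hadoop): remembers when the
 * under-replicated list was last scanned, using a single long
 * instead of a copy of the whole list.
 */
public class ReplicationScanTracker {
    // -1 means "no scan has been recorded yet".
    private final AtomicLong lastScanMillis = new AtomicLong(-1L);

    /** Assumed to be called right after chooseUnderReplicatedBlocks returns. */
    public void recordScan(long nowMillis) {
        lastScanMillis.set(nowMillis);
    }

    /** Milliseconds since the last recorded scan, or -1 if none was recorded. */
    public long millisSinceLastScan(long nowMillis) {
        long last = lastScanMillis.get();
        return last < 0 ? -1L : nowMillis - last;
    }
}
```

The caller passes in the current time (e.g. from a monotonic clock), so the cost per scan is one atomic write, and reading the metric never touches the under-replicated queues.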
On the other hand, we have UnderReplicatedBlocks and PendingReplicationBlocks, right? The replication monitor thread periodically picks up some under-replicated blocks. Unless the NN stalls (e.g., during a full GC), computing replication work will always happen in some CPU time slice. Of course it could be slow, since the NN may have many other things to handle (e.g., many requests), but if the NN is slow we already have many ways to know it. Regarding Akira's comment that the metric is also about the entire HDFS cluster: we are talking about DataNodes here. If we want to judge cluster health from the replication point of view, I think the more correct thing is to record the number of timed-out pending replication blocks (PendingReplicationBlocks), which time out when the network is very busy or the target DNs are corrupted. UnderReplicatedBlocks can't stand for that.
So if we want some metrics about replicated blocks in the NN, let's find a lightweight way, as suggested. Thanks.