I'm seeing ExcessBlocks metric increases to more than 300K in some clusters, however, there are no over-replicated blocks (confirmed by fsck).
After a further research, I noticed when deleting a block, BlockManager does not remove the block from excessReplicateMap or decrement excessBlocksCount.
Usually the metric is decremented when processing block report, however, if the block has been deleted, BlockManager does not remove the block from excessReplicateMap or decrement the metric.
That way the metric and excessReplicateMap can increase infinitely (i.e. memory leak can occur).