Description
When a datanode is reported as slow by another node, we expose the slow node along with the list of nodes that reported it. However, we do not expose the latency of the slow node as measured by each reporting node. Exposing this latency in the metrics would help operators track how far a given slow node lags behind the rest of the nodes in the cluster.
Operators should be able to gather the aggregated latencies of all slow nodes, together with their reporting nodes, from the NameNode metrics.
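As a rough illustration of the shape such a metric could take, the sketch below groups per-reporter latencies under each slow node. It is not the HDFS implementation: the function name, the field names (`slowNode`, `reportingNodes`, `latencyMs`), the node addresses, and the latency values are all hypothetical and chosen only for this example.

```python
import json
from collections import defaultdict


def aggregate_slow_peer_reports(reports):
    """Group (reporting_node, slow_node, latency_ms) tuples by slow node.

    Hypothetical helper: it only illustrates the kind of aggregated view
    requested in this issue, not the actual NameNode metrics format.
    """
    by_slow_node = defaultdict(dict)
    for reporting_node, slow_node, latency_ms in reports:
        # Keep the most recent latency each reporter gave for this slow node.
        by_slow_node[slow_node][reporting_node] = latency_ms

    # Emit one entry per slow node, listing every reporter with its latency.
    return [
        {
            "slowNode": slow_node,
            "reportingNodes": [
                {"node": reporter, "latencyMs": latency}
                for reporter, latency in sorted(reporters.items())
            ],
        }
        for slow_node, reporters in sorted(by_slow_node.items())
    ]


# Made-up sample reports: (reporting node, slow node, latency in ms).
reports = [
    ("dn2:9866", "dn1:9866", 35.0),
    ("dn3:9866", "dn1:9866", 41.5),
    ("dn1:9866", "dn4:9866", 12.0),
]
summary = aggregate_slow_peer_reports(reports)
print(json.dumps(summary, indent=2))
```

With a view like this, an operator could see at a glance both which nodes flagged a slow peer and how large the latency was from each reporter's perspective.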
Issue Links
- is related to
  - HDFS-11194 Maintain aggregated peer performance metrics on NameNode (Resolved)
  - HDFS-16521 DFS API to retrieve slow datanodes (Resolved)
- relates to
  - HDFS-16595 Slow peer metrics - add median, mad and upper latency limits (Resolved)
- links to