Details
Description
When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found that many slow peers were reported even though the ReportingNodes' averageDelay was very low and the reported nodes were healthy. I think so many slow peers are generated because DataNodePeerMetrics#LOW_THRESHOLD_MS is too small (only 5ms) and is not configurable. The default slow IO warning log threshold is 300ms, i.e. DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise the NameNode receives a lot of invalid slow peer information.
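To illustrate the effect, here is a minimal, self-contained Java sketch. It is not the Hadoop implementation: the class name, the config key `example.slowpeer.low.threshold.ms`, and the simplified rule (a peer is slow if its average latency exceeds the larger of the low threshold and 3x the median) are all assumptions made for this example. It only shows how a 5ms floor flags a healthy peer that a configurable, higher floor would ignore.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class SlowPeerThresholdSketch {
    // Hypothetical config key, used only in this sketch.
    static final String LOW_THRESHOLD_KEY = "example.slowpeer.low.threshold.ms";
    // Mirrors the hard-coded 5ms value described in this issue.
    static final long LOW_THRESHOLD_DEFAULT_MS = 5L;

    // Read the low threshold from configuration, falling back to the default.
    static long readLowThresholdMs(Properties conf) {
        return Long.parseLong(
            conf.getProperty(LOW_THRESHOLD_KEY, Long.toString(LOW_THRESHOLD_DEFAULT_MS)));
    }

    // Simplified outlier rule (an assumption, not Hadoop's exact logic):
    // a peer is slow if its latency exceeds max(lowThresholdMs, 3 * median).
    static Set<String> findSlowPeers(Map<String, Double> avgLatencyMs, long lowThresholdMs) {
        Set<String> slow = new TreeSet<>();
        if (avgLatencyMs.isEmpty()) {
            return slow;
        }
        List<Double> sorted = new ArrayList<>(avgLatencyMs.values());
        Collections.sort(sorted);
        int n = sorted.size();
        double median = (n % 2 == 1)
            ? sorted.get(n / 2)
            : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
        double limit = Math.max((double) lowThresholdMs, 3.0 * median);
        for (Map.Entry<String, Double> e : avgLatencyMs.entrySet()) {
            if (e.getValue() > limit) {
                slow.add(e.getKey());
            }
        }
        return slow;
    }
}
```

With average latencies of 1.0ms, 1.2ms and 8.0ms for three peers, a 5ms threshold reports the 8.0ms peer as slow, while a threshold of 300ms reports none.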
Attachments