Description
Currently, we export only averages for rpcQueueTime and rpcProcessingTime. These metrics are most useful when investigating timeouts and slow responses, which in my experience are often caused by momentary spikes in load. Such spikes do not show up in averages taken over the 15+ second intervals that metrics systems commonly use. We should collect at least the maximum queue time and processing time over each interval, or percentiles if that is not too expensive.
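To illustrate the point, here is a minimal sketch of a per-interval accumulator that reports max and percentiles alongside the average. The class name `IntervalStats` and all of its methods are hypothetical and not part of Hadoop. It keeps every sample in memory and uses a simple nearest-rank percentile, which is an assumption made for clarity; a real implementation would more likely use a bounded or streaming estimator to keep the cost down.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Hadoop class): summarizes latency samples for one reporting interval. */
final class IntervalStats {
    private long[] samples = new long[16];
    private int count = 0;

    /** Records one latency sample, in milliseconds. */
    synchronized void add(long millis) {
        if (count == samples.length) {
            samples = Arrays.copyOf(samples, count * 2); // grow the buffer when full
        }
        samples[count++] = millis;
    }

    /** Arithmetic mean of the samples in this interval, or 0 if there are none. */
    synchronized double mean() {
        if (count == 0) {
            return 0.0;
        }
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += samples[i];
        }
        return (double) sum / count;
    }

    /** Largest sample in this interval, or 0 if there are none. */
    synchronized long max() {
        long m = 0;
        for (int i = 0; i < count; i++) {
            m = Math.max(m, samples[i]);
        }
        return m;
    }

    /** Nearest-rank percentile for p in [0, 100], or 0 if there are no samples. */
    synchronized long percentile(int p) {
        if (count == 0) {
            return 0;
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        // rank = ceil(p/100 * count), clamped to [1, count]
        int rank = Math.min(count, Math.max(1, (p * count + 99) / 100));
        return sorted[rank - 1];
    }

    /** Clears the samples at the end of a reporting interval. */
    synchronized void reset() {
        count = 0;
    }
}
```

With 99 samples of 2 ms and a single 5000 ms spike in one interval, the mean comes out near 52 ms and the median stays at 2 ms, while the max reports the 5000 ms spike directly.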
Attachments
Issue Links
- is related to HADOOP-10305 Add "rpc.metrics.quantile.enable" and "rpc.metrics.percentiles.intervals" to core-default.xml (Closed)