I just discussed this with a colleague.
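The get, put, etc. histograms that each region server keeps are of limited use (depending on what you want to achieve, of course), because they are aggregated and calculated locally by each region server, and the resulting percentiles cannot be meaningfully combined across servers afterwards.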
It would be better to record the number of requests in certain latency bands, in addition to what we do now.
For example, the number of gets that took 0-5ms, 5-10ms, 10-20ms, 20-50ms, 50-100ms, 100-1000ms, > 1000ms, etc. (just as an example; the bands should be configurable).
That way we can do further calculations after the fact, and answer questions like: How often did we miss our SLA? What percentage of requests missed an SLA? Etc.
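
To make the idea concrete, here is a rough sketch of what such band counters could look like. This is only an illustration, not a proposed patch: the class name LatencyBands and its methods are hypothetical and do not exist in HBase, and the band boundaries are the example values from above. Each band counts requests with lower < latency <= upper, plus one overflow band for everything above the last boundary.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch: per-band request counters with configurable boundaries. */
final class LatencyBands {
    private final long[] upperBoundsMs; // inclusive upper bound of each band, ascending
    private final LongAdder[] counts;   // one counter per band, plus one overflow band

    LatencyBands(long[] upperBoundsMs) {
        this.upperBoundsMs = upperBoundsMs.clone();
        this.counts = new LongAdder[upperBoundsMs.length + 1];
        for (int i = 0; i < counts.length; i++) {
            counts[i] = new LongAdder();
        }
    }

    /** Increments the counter of the first band whose upper bound is >= latencyMs. */
    void record(long latencyMs) {
        int i = 0;
        while (i < upperBoundsMs.length && latencyMs > upperBoundsMs[i]) {
            i++;
        }
        counts[i].increment();
    }

    long count(int band) {
        return counts[band].sum();
    }

    /** Adds another server's counts; both must use the same boundaries. */
    void merge(LatencyBands other) {
        for (int i = 0; i < counts.length; i++) {
            counts[i].add(other.counts[i].sum());
        }
    }

    /** Fraction of requests slower than slaMs; exact only if slaMs is a band boundary. */
    double fractionOver(long slaMs) {
        long over = 0;
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            long c = counts[i].sum();
            total += c;
            // Band i starts just above upperBoundsMs[i - 1]; count it if that is >= the SLA.
            if (i > 0 && upperBoundsMs[i - 1] >= slaMs) {
                over += c;
            }
        }
        return total == 0 ? 0.0 : (double) over / total;
    }
}
```

Because the counters are plain sums, the values from all region servers can simply be added together (merge above) before computing something like fractionOver(100), which is exactly what is not possible with per-server percentiles.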