[HBASE-6261] Better approximate high-percentile latency metrics

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: metrics
Labels:
- metrics

Description

The existing reservoir-sampling-based latency metrics in HBase are not well suited to providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]); the question is determining which method best suits our needs and then implementing it.
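
To illustrate why a uniform reservoir struggles at the tail, here is a minimal sketch, assuming textbook reservoir sampling (Algorithm R) and a hypothetical reservoir size of 1000. It is not HBase's actual metrics code; the function names and the synthetic exponential latencies are assumptions made for illustration. With a uniform sample of size k, only about k * (1 - q) entries lie above the q-quantile, so a 99th-percentile estimate rests on roughly 10 samples.

```python
import math
import random


def reservoir_sample(stream, k, rng):
    """Algorithm R: keep a uniform random sample of up to k items."""
    reservoir = []
    for i, x in enumerate(stream):
        if i < k:
            reservoir.append(x)
        else:
            # randint is inclusive on both ends, so j is uniform on [0, i].
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = x
    return reservoir


def nearest_rank(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in (0, 1]."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]


def tail_sample_count(k, q):
    """Expected number of reservoir entries above the q-quantile."""
    return k * (1 - q)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic latencies (assumption): exponential with a mean of 10 ms.
    latencies = [rng.expovariate(1 / 10) for _ in range(100_000)]
    exact_p99 = nearest_rank(sorted(latencies), 0.99)
    sample = sorted(reservoir_sample(latencies, 1000, rng))
    print("exact p99:", exact_p99)
    print("reservoir p99:", nearest_rank(sample, 0.99))
    print("expected samples above p99:", tail_sample_count(1000, 0.99))
```

Rerunning with different seeds shows how much the reservoir's 99th-percentile estimate moves between runs, while the median stays comparatively stable.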

Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on the 90th percentile, or 0.1% on the 99th). It's also desirable to provide these estimates over different time-based sliding windows, e.g. the last 1 min, 5 mins, 15 mins, and 1 hour.
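
As one possible shape for such an estimator, the sketch below keeps a compact summary with a bounded rank error, in the style of the Greenwald-Khanna quantile summary. It is an illustrative assumption, not the method chosen for this issue: it uses a single uniform error `eps` rather than the per-quantile targets described above, it omits the band-based compression that gives the published space bound, and it does not handle sliding windows. The class and method names are hypothetical.

```python
import bisect
import math
import random


class GKSummary:
    """Simplified Greenwald-Khanna style summary (illustrative only)."""

    def __init__(self, eps):
        self.eps = eps
        self.n = 0
        # Parallel lists sorted by value. g[i] is the gap in minimum rank
        # to the previous entry; delta[i] is the rank uncertainty.
        self.vals, self.g, self.delta = [], [], []
        self.period = max(1, int(1 / (2 * eps)))

    def insert(self, v):
        i = bisect.bisect_right(self.vals, v)
        if i == 0 or i == len(self.vals):
            d = 0  # a new minimum or maximum has an exactly known rank
        else:
            # The new item's rank cannot exceed the successor's maximum rank.
            d = self.g[i] + self.delta[i] - 1
        self.vals.insert(i, v)
        self.g.insert(i, 1)
        self.delta.insert(i, d)
        self.n += 1
        if self.n % self.period == 0:
            self._compress()

    def _compress(self):
        threshold = math.floor(2 * self.eps * self.n)
        # Walk backwards, folding entry i into i + 1 while the merged entry
        # still satisfies g + delta <= threshold. Entry 0 (the minimum) stays.
        for i in range(len(self.vals) - 2, 0, -1):
            if self.g[i] + self.g[i + 1] + self.delta[i + 1] <= threshold:
                self.g[i + 1] += self.g[i]
                del self.vals[i], self.g[i], self.delta[i]

    def query(self, phi):
        """Return a value whose rank is within eps * n of ceil(phi * n)."""
        if self.n == 0:
            raise ValueError("no samples")
        r = max(1, math.ceil(phi * self.n))
        slack = self.eps * self.n
        rmin = 0
        for v, g, d in zip(self.vals, self.g, self.delta):
            rmin += g
            if r - rmin <= slack and (rmin + d) - r <= slack:
                return v
        return self.vals[-1]


if __name__ == "__main__":
    rng = random.Random(1)
    summary = GKSummary(eps=0.001)
    for _ in range(100_000):
        summary.insert(rng.expovariate(1 / 10))
    print("stored entries:", len(summary.vals))
    print("approx p99:", summary.query(0.99))
```

The trade-off is visible in `eps`: a smaller value tightens the rank error but keeps more entries. Covering the time-based windows would need an additional mechanism, for example one summary per window that is reset or rotated on a timer, which this sketch does not attempt.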

I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept.

[1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
[2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf

Attachments

- Latencyestimation.pdf (27/Jun/12 01:06, 66 kB, Andrew Wang)
- MetricsHistogram.data (23/Jul/12 23:34, 19 kB, Andrew Wang)
- parse.py (23/Jul/12 23:34, 5 kB, Andrew Wang)
- SampleQuantiles.data (23/Jul/12 23:34, 19 kB, Andrew Wang)

Issue Links

is related to

- HBASE-6409 Create histogram class for metrics 2 (Closed)
- HADOOP-8541 Better high-percentile latency metrics (Closed)

People

Assignee: Andrew Wang
Reporter: Andrew Wang
Votes: 0
Watchers: 18

Dates

Created: 22/Jun/12 20:30
Updated: 13/Jun/22 16:53
Resolved: 28/Dec/12 15:22