  HBase / HBASE-6261

Better approximate high-percentile latency metrics

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Component/s: metrics

    Description

      The existing reservoir-sampling-based latency metrics in HBase are not well suited to providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]); the open question is which method best suits our needs, followed by implementing it.
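      To make the concern concrete, here is a minimal sketch of how one could measure the rank error of a reservoir-based 99th-percentile estimate. It is not HBase's implementation: the reservoir size of 1000, the exponentially distributed synthetic latencies, and all function names are assumptions chosen for illustration.

```python
import bisect
import random


def reservoir_sample(stream, k, rng):
    """Algorithm R: keep a uniform random sample of up to k items from a stream."""
    sample = []
    for i, value in enumerate(stream):
        if i < k:
            sample.append(value)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = value
    return sample


def quantile(sorted_values, q):
    """Nearest-rank quantile of an already sorted, non-empty list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def rank_error(sorted_population, estimate, q):
    """Distance between the estimate's true rank fraction and the target q."""
    true_rank = bisect.bisect_left(sorted_population, estimate)
    return abs(true_rank / len(sorted_population) - q)


def demo():
    rng = random.Random(42)
    # Hypothetical latencies in milliseconds (mean 10 ms); not real HBase data.
    latencies = [rng.expovariate(0.1) for _ in range(100_000)]
    population = sorted(latencies)
    # Reservoir size 1000 is an arbitrary illustrative choice.
    sample = sorted(reservoir_sample(latencies, 1000, rng))
    for q in (0.90, 0.95, 0.99):
        estimate = quantile(sample, q)
        print(f"q={q}: exact={quantile(population, q):.2f} "
              f"estimate={estimate:.2f} "
              f"rank_error={rank_error(population, estimate, q):.4f}")


if __name__ == "__main__":
    demo()
```

      As a rough guide, a standard binomial approximation puts the standard deviation of the estimated rank for a uniform sample of size k at about sqrt(q(1-q)/k). For q = 0.99 and k = 1000 that is roughly 0.3%, so a reservoir of this size would not be expected to hit a 0.1% target at the 99th percentile consistently.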

      Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on the 90th percentile, or 0.1% on the 99th). It's also desirable to provide this over different time-based sliding windows, e.g. the last 1 minute, 5 minutes, 15 minutes, and 1 hour.
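      As a reference point for the sliding-window requirement, the following is a naive exact baseline, not one of the approximation algorithms from [1] or [2]. It stores every observation in the window, so its memory use grows with traffic; it is only useful as ground truth when evaluating an approximate estimator. The class name and its methods are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowQuantiles:
    """Exact quantiles over the last `window_seconds` of observations.

    Naive baseline: keeps every observation that falls inside the window.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._items = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        """Record one observation; timestamps must be non-decreasing."""
        self._items.append((timestamp, value))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop observations that are window_seconds old or older.
        cutoff = now - self.window_seconds
        while self._items and self._items[0][0] <= cutoff:
            self._items.popleft()

    def quantile(self, q, now):
        """Nearest-rank quantile of the values still inside the window."""
        self._evict(now)
        if not self._items:
            return None
        values = sorted(value for _, value in self._items)
        index = min(len(values) - 1, int(q * len(values)))
        return values[index]


if __name__ == "__main__":
    # One estimator per window: 1 min, 5 mins, 15 mins, 1 hour.
    windows = {w: SlidingWindowQuantiles(w) for w in (60, 300, 900, 3600)}
    for t in range(4000):
        latency_ms = 5 + (t % 100)  # hypothetical latency values
        for estimator in windows.values():
            estimator.add(t, latency_ms)
    for w, estimator in windows.items():
        print(w, estimator.quantile(0.99, now=3999))
```

      Keeping one instance per window is the simplest design but multiplies memory use by the number of windows; an approximate summary would presumably aim to avoid exactly that cost.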

      I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept.

      [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
      [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf

      Attachments

        1. Latencyestimation.pdf
          66 kB
          Andrew Wang
        2. MetricsHistogram.data
          19 kB
          Andrew Wang
        3. parse.py
          5 kB
          Andrew Wang
        4. SampleQuantiles.data
          19 kB
          Andrew Wang

            People

              Assignee: Andrew Wang (andrew.wang)
              Reporter: Andrew Wang (andrew.wang)