Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: metrics
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make better high-percentile latency metrics a part of hadoop-common.

      I've already got a working implementation of [1], an efficient algorithm for estimating quantiles on a stream of values. It allows you to specify arbitrary quantiles to track (e.g. 50th, 75th, 90th, 95th, 99th), along with tight error bounds. This estimator can be snapshotted and reset periodically to get a feel for how these percentiles are changing over time.

      I propose creating a new MutableQuantiles class that does this. [1] isn't completely without overhead (~1MB memory for reasonably sized windows), which is why I hesitate to add it to the existing MutableStat class.

      [1] Cormode, Korn, Muthukrishnan, and Srivastava. "Effective Computation of Biased Quantiles over Data Streams" in ICDE 2005.

      1. hadoop-8541-6.patch
        30 kB
        Andrew Wang
      2. hadoop-8541-5.patch
        30 kB
        Andrew Wang
      3. hadoop-8541-4.patch
        30 kB
        Andrew Wang
      4. hadoop-8541-3.patch
        26 kB
        Andrew Wang
      5. hadoop-8541-2.patch
        26 kB
        Andrew Wang
      6. hadoop-8541-1.patch
        25 kB
        Andrew Wang

        Issue Links

          Activity

          No work has yet been logged on this issue.

            People

            • Assignee:
              Andrew Wang
              Reporter:
              Andrew Wang
            • Votes:
              0 Vote for this issue
              Watchers:
              20 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development