KAFKA-10177

Replace/improve Percentiles metrics


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: metrics
    • Labels: None

    Description

      There's an existing – but seemingly unused – implementation of percentile metrics that we attempted to use for end-to-end latency metrics in Streams. Unfortunately a number of limitations became apparent, and we ultimately pulled the metrics from the 2.6 release pending further investigation/improvement.

      The problems we encountered were:

      1. The class requires a static upper/lower limit for the values.
      2. It is not well suited to a distribution with a long tail, i.e. setting the max value too high caused the accuracy to plummet.
      3. It required a lot of memory per metric for reasonable accuracy and caused us to hit OOM (it is unclear whether there was actually a memory leak, or it was just consuming unnecessarily large amounts in general).
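
      To make the first two problems concrete, here is a minimal sketch of a histogram with equal-width buckets over a fixed range. It is an illustration only and is not the Percentiles implementation; the class name FixedRangeHistogram and its methods are hypothetical. The point it shows is that, for a fixed number of buckets, the resolution of every estimate is tied to the configured max.

```java
/** Hypothetical illustration only; this is NOT Kafka's Percentiles class. */
final class FixedRangeHistogram {
    private final double min;
    private final double max;
    private final long[] counts;
    private long total;

    FixedRangeHistogram(double min, double max, int buckets) {
        this.min = min;
        this.max = max;
        this.counts = new long[buckets];
    }

    /** Width of each equal-sized bucket; estimates cannot be finer than this. */
    double bucketWidth() {
        return (max - min) / counts.length;
    }

    void record(double value) {
        // Values outside the configured range are clamped into the first or last bucket.
        int idx = (int) ((value - min) / bucketWidth());
        idx = Math.max(0, Math.min(counts.length - 1, idx));
        counts[idx]++;
        total++;
    }

    /** Returns the upper edge of the bucket that contains the requested quantile. */
    double quantile(double q) {
        long target = (long) Math.ceil(q * total);
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= target) {
                return min + (i + 1) * bucketWidth();
            }
        }
        return max;
    }
}
```

      In this sketch, recording the values 1 through 100 into a histogram with range 0 to 100 and 100 buckets gives 1-unit buckets. The same data in a histogram with range 0 to 60000 and 100 buckets all lands in the first 600-unit bucket, so the median estimate is that bucket's edge. Keeping 1-unit resolution over the wider range would take 60,000 buckets, which is the kind of accuracy/memory trade-off described in items 2 and 3.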

      Since the Percentiles class is part of the public API, we may need to create a new class altogether and possibly deprecate/remove the old one. Alternatively, we can consider re-implementing the existing class from scratch and deprecating the current constructors and associated implementation (e.g. the constructor that accepts a max).
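
      The second option could take roughly the following shape. This is a hedged sketch of the API migration only: PercentileTracker and its methods are hypothetical names, not existing Kafka classes, and the exact nearest-rank computation over stored samples is a placeholder that a real implementation would replace with a bounded-memory algorithm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a re-implemented percentile class; not Kafka code. */
final class PercentileTracker {
    private final List<Double> samples = new ArrayList<>();
    private final double max; // only meaningful on the legacy path

    /** New-style constructor: no static bounds required. */
    PercentileTracker() {
        this.max = Double.POSITIVE_INFINITY;
    }

    /**
     * Legacy-style constructor that still accepts a max.
     *
     * @deprecated kept for source compatibility; prefer the no-arg constructor.
     */
    @Deprecated
    PercentileTracker(double max) {
        this.max = max;
    }

    void record(double value) {
        // Legacy callers keep their clamping behavior; new callers are unbounded.
        samples.add(Math.min(value, max));
    }

    /** Exact nearest-rank quantile; placeholder for a bounded-memory algorithm. */
    double quantile(double q) {
        if (samples.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.size());
        return sorted.get(Math.max(0, rank - 1));
    }
}
```

      Existing callers that pass a max would keep compiling (with a deprecation warning) and keep their clamped results, while new callers would not have to guess an upper bound up front.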

          People

            Assignee: Unassigned
            Reporter: A. Sophie Blee-Goldman (ableegoldman)
