Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-10262

Collect request latency metrics for histograms

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0
    • Component/s: metrics
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      Since Andrzej Bialecki is on a role with metrics...
      There is no way to accurately compute request latency percentiles from metrics exposed by Solr today. We should consider making that possible. c.f. https://github.com/HdrHistogram/HdrHistogram

      1. SOLR-10262.patch
        52 kB
        Andrzej Bialecki

        Issue Links

          Activity

          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 3217fd7c3cc7868648133950bc98d1b480c673b7 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3217fd7 ]

          SOLR-10262: Add support for configurable metrics implementations.

          Show
          jira-bot ASF subversion and git services added a comment - Commit 3217fd7c3cc7868648133950bc98d1b480c673b7 in lucene-solr's branch refs/heads/master from Andrzej Bialecki [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3217fd7 ] SOLR-10262 : Add support for configurable metrics implementations.
          Hide
          ab Andrzej Bialecki added a comment -

          Otis Gospodnetic if you don't have any objections to the current patch I'd like to commit it soon.

          Show
          ab Andrzej Bialecki added a comment - Otis Gospodnetic if you don't have any objections to the current patch I'd like to commit it soon.
          Hide
          ab Andrzej Bialecki added a comment -

          Otis Gospodnetic - actually, I hesitated to add either HdrHistogram or the one from rolling-metrics. The first one doesn't implement exponential decay or at least some form of incremental reset (eg. time-window based), and reset-on-snapshot will never work correctly with multiple reporters. The rolling-metrics package implements exponential decay and much more, but it does it by using multiple partial histograms for each Histogram instance, which concerns me from the point of view of overall memory impact (we create hundreds of metrics per core), background thread mgmt, etc.

          So, from my point of view the patch as it is now may be sufficient and maybe we should stop here - it allows you to customize what implementation and parameters of each metric is used, and it allows you to provide your own implementation and evaluate its impact. Please review the patch and let me know what you think.

          Show
          ab Andrzej Bialecki added a comment - Otis Gospodnetic - actually, I hesitated to add either HdrHistogram or the one from rolling-metrics. The first one doesn't implement exponential decay or at least some form of incremental reset (eg. time-window based), and reset-on-snapshot will never work correctly with multiple reporters. The rolling-metrics package implements exponential decay and much more, but it does it by using multiple partial histograms for each Histogram instance, which concerns me from the point of view of overall memory impact (we create hundreds of metrics per core), background thread mgmt, etc. So, from my point of view the patch as it is now may be sufficient and maybe we should stop here - it allows you to customize what implementation and parameters of each metric is used, and it allows you to provide your own implementation and evaluate its impact. Please review the patch and let me know what you think.
          Hide
          ab Andrzej Bialecki added a comment -

          This patch adds support for configuring existing metric implementations (eg. setting reservoir impl / parameters, clock, etc) as well as using custom implementations.

          This support is needed in order to cleanly add HdrHistogram or implementations available in rolling metrics.

          Show
          ab Andrzej Bialecki added a comment - This patch adds support for configuring existing metric implementations (eg. setting reservoir impl / parameters, clock, etc) as well as using custom implementations. This support is needed in order to cleanly add HdrHistogram or implementations available in rolling metrics .
          Hide
          otis Otis Gospodnetic added a comment -

          This would have to be configured early on in solr.xml or even via system properties, which is a bit ugly.

          Not sure what exactly you mean by this, but I don't think it should be the new default because of http://search-lucene.com/m/Lucene/l6pAi15LobI6m5Ny1?subj=Solr+JMX+changes+and+backwards+in+compatibility . I am hoping it can be added to whatever is already there. Then people and tools that monitor Solr can decide which data they want to collect. The old stuff could be marked/announced as deprecated if we really don't want/need that data, and removed in one of the future releases.

          Show
          otis Otis Gospodnetic added a comment - This would have to be configured early on in solr.xml or even via system properties, which is a bit ugly. Not sure what exactly you mean by this, but I don't think it should be the new default because of http://search-lucene.com/m/Lucene/l6pAi15LobI6m5Ny1?subj=Solr+JMX+changes+and+backwards+in+compatibility . I am hoping it can be added to whatever is already there. Then people and tools that monitor Solr can decide which data they want to collect. The old stuff could be marked/announced as deprecated if we really don't want/need that data, and removed in one of the future releases.
          Hide
          ab Andrzej Bialecki added a comment -

          A few relevant links:

          • discussion thread where Gil Tene (from Azul) explains why default reservoirs in Codahale Metrics are suboptimal, to put it mildly...
          • How NOT to measure latency
          • HrdHistogram reservoir implementation. Notably it contains HdrHistogramReservoir which tracks values since its creation, and HdrHistogramResetOnSnapshotReservoir which resets its state to zero after each snapshot. There's no equivalent exponentially-decayed reservoir, and we can't use the one that resets on snapshot (because multiple clients may read from it at different intervals).
          Show
          ab Andrzej Bialecki added a comment - A few relevant links: discussion thread where Gil Tene (from Azul) explains why default reservoirs in Codahale Metrics are suboptimal, to put it mildly... How NOT to measure latency HrdHistogram reservoir implementation. Notably it contains HdrHistogramReservoir which tracks values since its creation, and HdrHistogramResetOnSnapshotReservoir which resets its state to zero after each snapshot. There's no equivalent exponentially-decayed reservoir, and we can't use the one that resets on snapshot (because multiple clients may read from it at different intervals).
          Hide
          ab Andrzej Bialecki added a comment -

          Thanks Otis Yes, I'm aware of that implementation and the discussion around it. It's an easy fix to replace the ExponentiallyDecayingReservoir that metrics use by default with HdrHistogram, there's even a pull request against Codahale repo with an implementation.

          By "making it possible" did you mean that we make it the new default, but we also allow people to use a different implementation? This would have to be configured early on in solr.xml or even via system properties, which is a bit ugly.

          Show
          ab Andrzej Bialecki added a comment - Thanks Otis Yes, I'm aware of that implementation and the discussion around it. It's an easy fix to replace the ExponentiallyDecayingReservoir that metrics use by default with HdrHistogram, there's even a pull request against Codahale repo with an implementation. By "making it possible" did you mean that we make it the new default, but we also allow people to use a different implementation? This would have to be configured early on in solr.xml or even via system properties, which is a bit ugly.

            People

            • Assignee:
              ab Andrzej Bialecki
              Reporter:
              otis Otis Gospodnetic
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development