The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a well-known way to track metrics about applications.
In SOLR-1972, latency percentile tracking was added. The comment list is long, so here’s my synopsis:
1. An attempt was made to use the Metrics library
2. That attempt failed due to a memory leak in Metrics v2.1.1
3. Large parts of Metrics were then copied wholesale into the org.apache.solr.util.stats package space and that was used instead.
Copy/pasting Metrics code into Solr may have been the correct solution at the time, but I submit that it isn’t correct any more.
The leak in Metrics was fixed even before SOLR-1972 was released, and by copy/pasting a subset of the functionality, we lose access to other important things that the Metrics library provides, particularly the concept of a Reporter (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters).
Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s used in two contrib modules (map-reduce and morphlines-core).
I’m proposing that:
1. Metrics as bundled with Solr be upgraded to the current v3.1.2
2. Most of the org.apache.solr.util.stats package space be deleted outright, or gutted and replaced with simple calls to Metrics. Due to the copy/paste origin, the concepts should mostly map 1:1.
I’d further recommend a usage pattern like:
There are all kinds of areas in Solr that could benefit from metrics tracking and reporting. This pattern allows diverse areas of code to track metrics within a single, named registry. This well-known-name then becomes a handle you can use to easily attach a Reporter and ship all of those metrics off-box.
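As a rough illustration of the shared, named registry idea, here is a minimal sketch against the Metrics 3.x API. The registry name "solr-registry", the metric name "requests.select", and the class and method names are all hypothetical and chosen only for the example. It assumes metrics-core is on the classpath, and it uses a ConsoleReporter only because that needs no external service.

```java
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.SharedMetricRegistries;
import com.codahale.metrics.Timer;

import java.util.concurrent.TimeUnit;

public class SharedRegistrySketch {

    // Hypothetical well-known name; any code that knows it gets the same registry.
    static final String REGISTRY_NAME = "solr-registry";

    // Looks up the shared registry by name, creating it on first use.
    static MetricRegistry registry() {
        return SharedMetricRegistries.getOrCreate(REGISTRY_NAME);
    }

    // Times one unit of work under a hypothetical metric name ("requests.select").
    static void handleRequest(Runnable work) {
        Timer timer = registry().timer(MetricRegistry.name("requests", "select"));
        Timer.Context ctx = timer.time();
        try {
            work.run();
        } finally {
            ctx.stop(); // records the elapsed time in the timer
        }
    }

    public static void main(String[] args) {
        // Unrelated areas of code could each record into the same registry.
        handleRequest(() -> { });
        handleRequest(() -> { });

        // The well-known name is the only handle needed to attach a Reporter.
        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry())
                .convertRatesTo(TimeUnit.SECONDS)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .build();
        reporter.report(); // one-off dump; start(period, unit) would report on a schedule
    }
}
```

Swapping ConsoleReporter for a reporter that ships data elsewhere would not require touching the code that records the metrics, which is the main appeal of going through a named registry.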