[SOLR-13234] Prometheus Metric Exporter Not Threadsafe - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 7.6, 8.0
Fix Version/s: 7.7.2, 8.1, 9.0
Component/s: contrib - prometheus-exporter, metrics
Labels:
- metric-collector

Description

The Solr Prometheus Exporter collects metrics when it receives an HTTP request from Prometheus, which sends this request on its scrape interval. When collecting the Solr metrics takes longer than the scrape interval of the Prometheus server, metric collection runs concurrently in this method. The method doesn’t appear to be thread safe; for instance, a map can be modified concurrently. After a while the Solr Exporter process becomes nondeterministic: we've observed NPEs and loss of metrics.

To address this, I'm proposing the following fixes:

1. Read/parse the configuration at startup and make it immutable.
2. Collect metrics from Solr on an interval controlled by the Solr Exporter, and cache the metric samples to return during Prometheus scraping. Metric collection can be expensive (for example, executing arbitrary Solr searches), so it's not ideal to allow concurrent collection, or collection on an interval that the Solr Exporter does not define.
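
The two fixes above could look roughly like the following. This is only a sketch of the idea, not the code in the patch: the class name `CachedMetricsCollector`, the `Supplier` standing in for the real Solr queries, and the string-based samples are all hypothetical, and it assumes Java 10 or newer for `Map.copyOf` / `List.copyOf`.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch: collect on our own schedule, serve scrapes from a cached snapshot. */
public final class CachedMetricsCollector implements AutoCloseable {
    // Fix 1: configuration is copied once at construction and never mutated afterwards.
    private final Map<String, String> config;
    // Stand-in for the expensive Solr queries (an assumption for illustration).
    private final Supplier<List<String>> source;
    // Fix 2: the latest samples, swapped atomically so readers never see a partial update.
    private final AtomicReference<List<String>> cache = new AtomicReference<>(List.of());
    // A single thread means at most one collection is ever in flight.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public CachedMetricsCollector(Map<String, String> config, Supplier<List<String>> source) {
        this.config = Map.copyOf(config);
        this.source = source;
    }

    /** Starts periodic collection; the next run is scheduled only after the previous one ends. */
    public void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(this::refresh, 0, periodSeconds, TimeUnit.SECONDS);
    }

    /** Runs one collection and publishes an immutable snapshot of the result. */
    public void refresh() {
        try {
            cache.set(List.copyOf(source.get()));
        } catch (RuntimeException e) {
            // Keep serving the previous snapshot; an uncaught exception would
            // also cancel all future scheduled runs.
        }
    }

    /** Called on each Prometheus scrape: cheap, and safe to call from many threads. */
    public List<String> scrape() {
        return cache.get();
    }

    public Map<String, String> config() {
        return config;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With this shape, a slow collection only makes the cached samples staler; it cannot cause overlapping collections, because scrapes never trigger collection themselves.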

There are also a few other performance improvements that we've made while fixing this, for example using the ClusterStateProvider instead of sending multiple HTTP requests to each Solr node to look up all the cores.

I'm currently finishing up these changes, which I'll submit as a PR.

Attachments

SOLR-13234-branch_7x.patch
05/Mar/19 04:22
414 kB
Shalin Shekhar Mangar

Issue Links

causes

SOLR-13392 Unable to start prometheus-exporter in 7x branch

Closed

links to

GitHub Pull Request #571

GitHub Pull Request #605

People

Assignee: Shalin Shekhar Mangar

Reporter: Danyal Prout

Votes: 0

Watchers: 10

Dates

Created: 08/Feb/19 15:52

Updated: 29/Nov/20 06:23

Resolved: 14/Mar/19 05:48

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 40m