Details
-
New Feature
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
As promised in my ApacheCon talk (recording here: https://www.youtube.com/watch?v=mR_3bbqZSRg), I'd like to ship a core set of alerting rules for monitoring SolrCloud clusters specifically on Kubernetes.
The basic guidance will be for users to import these rules into the Prometheus stack (which provides a PrometheusRule CRD). Users will need to adjust the thresholds for their specific uses (e.g. p95) and alert interval. Eventually, I'd like to have a starter runbook to accompany the alert rules, but not there yet.
Also, it's important to realize that the Prometheus stack provides a core set of alerting rules for monitoring node and pod health, so the Solr specific set should be considered complementary to those and users will be expected to use both sets, see: https://github.com/prometheus-operator/kube-prometheus/blob/0821adabf6f9f7aebf9343ecb07707826ce693ee/manifests/kubernetes-prometheusRule.yaml
If users are not running in Kubernetes or not using the Prometheus stack, the rules should still be useful as a getting started guide on what to monitor.
Attachments
Issue Links
- links to