Details
- Improvement
- Status: Closed
- Critical
- Resolution: Fixed
- 8.6.3, 9.0
- None
Description
I have a Solr cluster with 300 collections and use the Prometheus metrics exporter to collect cluster information, but each collection run takes about 2 minutes. The `jstack` output is as follows:
"solr-exporter-collectors-1-thread-2" #21 prio=5 os_prio=0 tid=0x00007fcef8009000 nid=0x45208 runnable [0x00007fcf16470000]
   java.lang.Thread.State: RUNNABLE
    at io.prometheus.client.Collector$MetricFamilySamples$Sample.equals(Collector.java:95)
    at java.util.ArrayList.indexOf(ArrayList.java:323)
    at java.util.ArrayList.contains(ArrayList.java:306)
    at org.apache.solr.prometheus.collector.MetricSamples.addSampleIfMetricExists(MetricSamples.java:50)
    at org.apache.solr.prometheus.collector.MetricSamples.addAll(MetricSamples.java:60)
    at org.apache.solr.prometheus.collector.MetricsCollector.lambda$collect$0(MetricsCollector.java:38)
    at org.apache.solr.prometheus.collector.MetricsCollector$$Lambda$127/68757342.accept(Unknown Source)
    at java.util.HashMap.forEach(HashMap.java:1291)
    at org.apache.solr.prometheus.collector.MetricsCollector.collect(MetricsCollector.java:38)
    at org.apache.solr.prometheus.collector.SchedulerMetricsCollector.lambda$collectMetrics$0(SchedulerMetricsCollector.java:91)
    at org.apache.solr.prometheus.collector.SchedulerMetricsCollector$$Lambda$75/817493591.get(Unknown Source)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$39/351002168.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
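The `contains` call accounts for about 90% of the execution time.
Looking at the code in MetricSamples.java, each sample is checked for duplicates before it is added to `sampleFamily.samples`. Because `sampleFamily.samples` is an `ArrayList`, `contains` is a linear scan that calls `Sample.equals` on each element, so the total cost of filling the list grows quadratically with its size. Once `sampleFamily.samples` reaches 20,000 entries, `sampleFamily.samples.contains` becomes very inefficient.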
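One possible way to avoid the linear scan, shown here only as a minimal sketch and not as the actual Solr change, is to keep a `HashSet` next to the list so that the duplicate check is a hash lookup. The class `DedupSamples` and its methods are hypothetical names introduced for illustration, and the sketch assumes the element type implements `hashCode` consistently with `equals`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Solr): an insertion-ordered list that rejects duplicates. */
class DedupSamples<T> {
    // Keeps samples in the order they were first added.
    private final List<T> samples = new ArrayList<>();
    // Mirrors the list contents so a membership check is O(1) on average
    // instead of a scan over every element already stored.
    private final Set<T> seen = new HashSet<>();

    /** Adds the sample unless an equal one is already present; returns true if it was added. */
    public boolean addIfAbsent(T sample) {
        if (!seen.add(sample)) {
            return false; // an equal sample was added earlier
        }
        samples.add(sample);
        return true;
    }

    /** Read-only view of the deduplicated samples in insertion order. */
    public List<T> asList() {
        return Collections.unmodifiableList(samples);
    }
}
```

With this shape, adding 20,000 distinct samples costs roughly 20,000 hash lookups, whereas `ArrayList.contains` can need up to about 2 × 10^8 `equals` calls in total for the same list. The trade-off is the extra memory for the set.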