Description
We've run into this issue quite often. When it happens, the replica is marked as down. The workaround seems to be restarting the Solr service, but this is quite random and might not always be feasible.
Today I noticed that it seemed to be hanging while loading the replica. When the service stopped, these messages were logged:
2024-08-20 18:27:15.827 INFO (coreLoadExecutor-17-thread-1-processing-ip-100-65-231-167.ec2.internal:8983_solr) [c:1_80084c8562132c47_2d076556_1914ca95bf9_8000I18454740_a2f8_5f48_a1d7_9ecfea41540d s:shard1 r:core_node18 x:1_80084c8562132c47_2d076556_1914ca95bf9_8000I18454740_a2f8_5f48_a1d7_9ecfea41540d_shard1_replica_n17] o.a.s.c.SolrCore Interrupted waiting for searcherLock => java.lang.InterruptedException
at java.base/java.lang.Object.wait(Native Method)
java.lang.InterruptedException: null
at java.lang.Object.wait(Native Method) ~[?:?]
at java.lang.Object.wait(Object.java:338) ~[?:?]
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2538) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1290) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1175) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1056) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1705) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.CoreContainer.lambda$loadInternal$12(CoreContainer.java:1043) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:234) ~[metrics-core-4.2.20.jar:4.2.20]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
There were tons of threads waiting for the lock as well:
2024-08-20 18:28:16.012 INFO (qtp1768242710-971) [] o.a.s.c.SolrCore Interrupted waiting for searcherLock => java.lang.InterruptedException
at java.base/java.lang.Object.wait(Native Method)
java.lang.InterruptedException: null
at java.lang.Object.wait(Native Method) ~[?:?]
at java.lang.Object.wait(Object.java:338) ~[?:?]
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2538) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2281) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2116) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.withSearcher(SolrCore.java:2134) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.getSegmentCount(SolrCore.java:539) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.lambda$initializeMetrics$11(SolrCore.java:1360) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.util.stats.MetricUtils.convertGauge(MetricUtils.java:656) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
at org.apache.solr.util.stats.MetricUtils.convertMetric(MetricUtils.java:355) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
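To get a rough count of how many threads were stuck, the log can be scanned for the "Interrupted waiting for searcherLock" message shown above. The following is only a minimal sketch, assuming the log line layout seen in these excerpts (date, time, level, then the thread name in parentheses). The class `SearcherLockWaiters` and the method `countByThreadPrefix` are hypothetical helpers written for illustration; they are not part of Solr.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SearcherLockWaiters {
    // Assumed layout: "<date> <time> <LEVEL> (<thread name>) ... <message>"
    private static final Pattern WAITER = Pattern.compile(
            "^\\S+ \\S+ \\w+\\s+\\(([^)]+)\\).*Interrupted waiting for searcherLock");

    /** Counts matching log lines, grouped by the thread-name prefix before the first '-'. */
    public static Map<String, Integer> countByThreadPrefix(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = WAITER.matcher(line);
            if (!m.find()) {
                continue; // stack frames and unrelated messages are skipped
            }
            String thread = m.group(1);
            int dash = thread.indexOf('-');
            String prefix = dash < 0 ? thread : thread.substring(0, dash);
            counts.merge(prefix, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a Solr log file (hypothetical usage)
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        countByThreadPrefix(lines).forEach((prefix, n) -> System.out.println(prefix + " " + n));
    }
}
```

Grouping by prefix separates the core-loading thread (`coreLoadExecutor`) from the Jetty request threads (`qtp...`), which makes it easier to see how many request threads were waiting on the same lock.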
Issue Links
- duplicates SOLR-17060 CoreContainer#create may deadlock with concurrent requests for metrics (Closed)