Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 8.6.3
Fix Version/s: None
Component/s: None
Description
Setup: Solr 8.6.3, external ZooKeeper 3.6.2 on Linux, 20 shards (5 x 4 nodes).
The following issues have occurred twice when running the 'service solr stop' command:
- The thread dump showed states such as 'waiting on condition' and 'TIMED_WAITING':
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.271-b09 mixed mode):

"Attach Listener" #142 daemon prio=9 os_prio=0 tid=0x00007f72b4001000 nid=0x15777 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"AutoscalingActionExecutor-44-thread-1" #141 prio=5 os_prio=0 tid=0x00007f71b800f000 nid=0x152f7 waiting on condition [0x00007f72c1b35000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x000000078e65cb38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
- The stop command eventually killed the Solr process.
- The PID files in SOLR_PID_DIR were deleted for an unknown reason. This is critical because many commands and processes do not work without those files.
- The data directories contained many leftover write.lock files, for example: 'solr86-8986/data/testdata_shard14_replica_n82/data/index/write.lock'
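
To make the last observation easier to reproduce on other nodes, here is a minimal Python sketch that lists every write.lock file under a given directory. The function name find_write_locks and the choice of root directory are illustrative assumptions, not part of Solr. The script only reports the files that exist; it does not determine whether a lock is actually held.

```python
from pathlib import Path


def find_write_locks(data_root):
    """Return the sorted paths of all files named write.lock under data_root.

    data_root is a hypothetical parameter: point it at the directory that
    holds the core data directories on the node being inspected.
    """
    root = Path(data_root)
    # rglob walks the tree recursively; keep regular files only.
    return sorted(p for p in root.rglob("write.lock") if p.is_file())
```

For example, calling find_write_locks("solr86-8986/data") on the affected node and printing each returned path would be expected to include entries such as the testdata_shard14_replica_n82 path quoted above.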