Details
-
Improvement
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
8.11.1
-
None
-
Java 17
OS: Windows 2016
-
New
Description
Hello,
a deadlock occurred in our application. We are using MMapDirectory on Windows 2016 and got the following stack trace:
"https-openssl-nio-443-exec-30" #166 daemon prio=5 os_prio=0 cpu=78703.13ms elapsed=81248.18s tid=0x000000002860af10 nid=0x237c in Object.wait() [0x00000000413fc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(java.base@17.0.2/Native Method)
    - waiting on <no object reference available>
    at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4983)
    - locked <0x00000006ef1fc020> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2697)
    - locked <0x00000006ef1fc020> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1236)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1278)
    at com.speed4trade.ebs.module.search.SearchService.updateSearchIndex(SearchService.java:1723)
    - locked <0x00000006d5c00208> (a org.apache.lucene.store.MMapDirectory)
    at com.speed4trade.ebs.module.businessrelations.ticket.TicketChangedListener.postUpdate(TicketChangedListener.java:142)
    ...
All other threads were waiting to lock <0x00000006d5c00208>, which was never released.
A Lucene merge thread was also blocked; I don't know whether this is relevant:
"Lucene Merge Thread #0" #18466 daemon prio=5 os_prio=0 cpu=15.63ms elapsed=3499.07s tid=0x00000000459453e0 nid=0x1f8 waiting for monitor entry [0x000000005da9e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.lucene.store.FSDirectory.deletePendingFiles(FSDirectory.java:346)
    - waiting to lock <0x00000006d5c00208> (a org.apache.lucene.store.MMapDirectory)
    at org.apache.lucene.store.FSDirectory.maybeDeletePendingFiles(FSDirectory.java:363)
    at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:248)
    at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
    at org.apache.lucene.index.ConcurrentMergeScheduler$1.createOutput(ConcurrentMergeScheduler.java:289)
    at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:121)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
    at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:227)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:105)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
It looks like the merge operation never finished, so the lock was never released.
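Reading the two traces together: the request thread holds the monitor of the MMapDirectory (<0x00000006d5c00208>, locked in SearchService.updateSearchIndex) while IndexWriter.close() waits for merges, and the merge thread is waiting to lock that same monitor in FSDirectory.deletePendingFiles. A minimal JDK-only sketch of that pattern may make it easier to discuss. It does not use Lucene at all; the class, method, and thread names are hypothetical, and plain objects stand in for the directory and the writer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {

    /**
     * Starts a background worker that needs the monitor of `directory`,
     * then waits for it while holding the monitor of `callerLock`.
     * Returns true if the worker finished within the timeout.
     */
    static boolean workerFinishes(Object callerLock, Object directory, long timeoutMs)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            // Stand-in for the merge thread entering a method that
            // synchronizes on the directory instance.
            synchronized (directory) {
                done.countDown();
            }
        }, "stand-in-merge-thread");
        worker.setDaemon(true);

        synchronized (callerLock) {
            worker.start();
            // Stand-in for close() waiting for running merges to complete.
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Object directory = new Object();
        // Caller synchronizes on the directory itself: the worker can never enter.
        System.out.println("caller locks directory: "
                + workerFinishes(directory, directory, 500));
        // Caller synchronizes on a private lock object: the worker is free to run.
        System.out.println("caller locks private object: "
                + workerFinishes(new Object(), directory, 500));
    }
}
```

In this sketch the first call times out and the second succeeds, which suggests that synchronizing application code on a dedicated private lock object instead of the directory instance would avoid the cycle. Whether that is practical in SearchService is an assumption on my side.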
Is there any option to prevent this deadlock, or a way to investigate it further?
Unfortunately, a load test did not reproduce the problem.
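Since the problem does not reproduce under load, one way to investigate further might be to log blocked threads from inside the running application, using the standard java.lang.management API. The sketch below is only an illustration: the class and method names are hypothetical, and it reports any thread in state BLOCKED together with the thread that owns the monitor it is waiting for.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class BlockedThreadReport {

    /** Lists every thread currently blocked on a monitor, with the owning thread. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // true, true: also collect locked monitors and ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName()
                        + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        blockedThreads().forEach(System.out::println);
    }
}
```

Calling blockedThreads() periodically (for example from a scheduled task) and logging non-empty results would show which thread owns the directory monitor at the moment other threads start to queue up behind it.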