Lucene - Core / LUCENE-10583

Deadlock with MMapDirectory while waitForMerges

Details

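    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 8.11.1
    • Fix Version/s: 10.0 (main), 9.4
    • Component/s: core/index
    • Labels: None
    • Environment: Java 17, OS: Windows 2016
    • Lucene Fields: New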

    Description

      Hello,

      We ran into a deadlock in our application. We are using MMapDirectory on Windows 2016 and captured the following stack trace:

      "https-openssl-nio-443-exec-30" #166 daemon prio=5 os_prio=0 cpu=78703.13ms elapsed=81248.18s tid=0x000000002860af10 nid=0x237c in Object.wait()  [0x00000000413fc000]
         java.lang.Thread.State: TIMED_WAITING (on object monitor)
          at java.lang.Object.wait(java.base@17.0.2/Native Method)
          - waiting on <no object reference available>
          at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4983)
          - locked <0x00000006ef1fc020> (a org.apache.lucene.index.IndexWriter)
          at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2697)
          - locked <0x00000006ef1fc020> (a org.apache.lucene.index.IndexWriter)
          at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1236)
          at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1278)
          at com.speed4trade.ebs.module.search.SearchService.updateSearchIndex(SearchService.java:1723)
          - locked <0x00000006d5c00208> (a org.apache.lucene.store.MMapDirectory)
          at com.speed4trade.ebs.module.businessrelations.ticket.TicketChangedListener.postUpdate(TicketChangedListener.java:142)
      ...

      All other threads were waiting to lock <0x00000006d5c00208>, which was never released.

      A Lucene merge thread was also blocked; I don't know whether this is relevant:

      "Lucene Merge Thread #0" #18466 daemon prio=5 os_prio=0 cpu=15.63ms elapsed=3499.07s tid=0x00000000459453e0 nid=0x1f8 waiting for monitor entry  [0x000000005da9e000]
         java.lang.Thread.State: BLOCKED (on object monitor)
          at org.apache.lucene.store.FSDirectory.deletePendingFiles(FSDirectory.java:346)
          - waiting to lock <0x00000006d5c00208> (a org.apache.lucene.store.MMapDirectory)
          at org.apache.lucene.store.FSDirectory.maybeDeletePendingFiles(FSDirectory.java:363)
          at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:248)
          at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
          at org.apache.lucene.index.ConcurrentMergeScheduler$1.createOutput(ConcurrentMergeScheduler.java:289)
          at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
          at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:121)
          at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
          at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
          at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:227)
          at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:105)
          at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
          at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
          at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
          at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
          at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)

      It looks like the merge operation never finished, so the lock was never released.
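
      Read together, the two traces suggest a lock-ordering problem: the request thread holds the monitor of the MMapDirectory instance (taken in SearchService.updateSearchIndex) while IndexWriter.close() waits for merges, and the merge thread needs that same monitor in FSDirectory.deletePendingFiles. The sketch below reproduces this pattern with plain JDK classes only. FakeDirectory, FakeWriter, UPDATE_LOCK and both method names are hypothetical stand-ins, not Lucene APIs, and the assumption that the application synchronizes on the directory is inferred from the trace, not confirmed by the application code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class DirectoryLockSketch {
    /** Stand-in for FSDirectory: a method that synchronizes on the instance itself. */
    static class FakeDirectory {
        synchronized void deletePendingFiles() {
            // Needs the directory's own monitor, like the frame in the merge thread's trace.
        }
    }

    /** Stand-in for IndexWriter: close() waits until the background merge is done. */
    static class FakeWriter {
        private final CountDownLatch mergeDone = new CountDownLatch(1);

        FakeWriter(FakeDirectory dir) {
            Thread merge = new Thread(() -> {
                dir.deletePendingFiles(); // blocks while another thread holds dir's monitor
                mergeDone.countDown();
            }, "fake-merge-thread");
            merge.setDaemon(true);
            merge.start();
        }

        /** Returns true if the merge finished within the timeout. */
        boolean close(long timeoutMillis) throws InterruptedException {
            return mergeDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Private lock that only the application knows about (hypothetical name). */
    private static final Object UPDATE_LOCK = new Object();

    /** Risky: holds the directory's monitor while waiting for the merge thread. */
    static boolean closeWhileHoldingDirectory(long timeoutMillis) throws InterruptedException {
        FakeDirectory dir = new FakeDirectory();
        synchronized (dir) {
            return new FakeWriter(dir).close(timeoutMillis);
        }
    }

    /** Safer: serializes updates on a lock the library never touches. */
    static boolean closeWhileHoldingPrivateLock(long timeoutMillis) throws InterruptedException {
        FakeDirectory dir = new FakeDirectory();
        synchronized (UPDATE_LOCK) {
            return new FakeWriter(dir).close(timeoutMillis);
        }
    }
}
```

      In this sketch the first variant can only time out, because the merge thread cannot enter deletePendingFiles until the caller leaves its synchronized block. The second variant completes, because the merge thread never competes for UPDATE_LOCK. If the application does synchronize on the Directory, switching to a dedicated lock object would be the analogous change.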

      Is there a way to prevent this deadlock, or a way to investigate it further?
      Unfortunately, a load test did not reproduce the problem.
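
      For further investigation, one option is to log which thread owns the monitor that blocked threads are waiting for, using the JDK's ThreadMXBean. The helper below is a minimal sketch; the class and method names are hypothetical. ThreadMXBean.findDeadlockedThreads() only reports cycles of threads waiting to acquire monitors or ownable synchronizers, so it would likely not flag the situation above, where the owner sits in Object.wait().

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class BlockedThreadReport {
    /** Returns one line per thread that is BLOCKED on a monitor, naming the current owner. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        // Lock name and owner are available without requesting locked monitors/synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED
                    && info.getLockOwnerName() != null) {
                lines.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return lines;
    }
}
```

      Calling blockedThreads() periodically, or when a request exceeds a time limit, and writing the result to the application log would show the owning thread by name without requiring a manual thread dump.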

            People

              Assignee: Unassigned
              Reporter: Thomas Hoffmann (tom_s4t)
              Votes: 0
              Watchers: 5

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m