  1. Lucene - Core
  2. LUCENE-7570

Tragic events during merges can lead to deadlock

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5, 7.0
    • Fix Version/s: 5.5.4, 6.4, 7.0
    • Component/s: core/index
    • Labels: None
    • Lucene Fields: New

    Description

      When an IndexWriter#commit() is stalled due to too many pending merges, you can get a deadlock if the currently active merge thread hits a tragic event.

      1. The thread performing the commit synchronizes on the commitLock in commitInternal.
      2. The thread goes on to call ConcurrentMergeScheduler#doStall(), which calls wait() on the ConcurrentMergeScheduler object. This releases the merge scheduler's monitor lock, but not the commitLock in IndexWriter.
      3. Sometime after this wait begins, the merge thread gets a tragic exception and calls IndexWriter#tragicEvent(), which in turn calls IndexWriter#rollbackInternal().
      4. IndexWriter#rollbackInternal() synchronizes on the commitLock, which is still held by the committing thread from (1) above, which is waiting on the merge(s) to complete. Hence, deadlock.
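
The four steps above can be reproduced in isolation with plain Java monitors. The following is a minimal sketch, not Lucene code: the class StalledCommitDeadlockSketch, its fields, and the reproduce method are hypothetical stand-ins for IndexWriter's commitLock and the ConcurrentMergeScheduler monitor, and the thread bodies only mimic the locking order described in the report.

```java
import java.util.concurrent.CountDownLatch;

final class StalledCommitDeadlockSketch {
    // Hypothetical stand-ins for IndexWriter's commitLock and the
    // ConcurrentMergeScheduler object; no Lucene classes are used.
    private final Object commitLock = new Object();
    private final Object mergeScheduler = new Object();
    private boolean stalled = true; // guarded by mergeScheduler

    /** Returns true if both threads are still stuck after timeoutMillis. */
    boolean reproduce(long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // Thread.join(0) would wait forever, so insist on a real timeout.
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        CountDownLatch commitIsStalled = new CountDownLatch(1);

        Thread commitThread = new Thread(() -> {
            synchronized (commitLock) {                 // step 1: take commitLock
                synchronized (mergeScheduler) {         // step 2: stall on the scheduler
                    commitIsStalled.countDown();
                    while (stalled) {
                        try {
                            mergeScheduler.wait();      // releases mergeScheduler, NOT commitLock
                        } catch (InterruptedException e) {
                            return;                     // only used to clean up the sketch
                        }
                    }
                }
            }
        }, "commit");

        Thread mergeThread = new Thread(() -> {
            try {
                commitIsStalled.await();                // the merge fails after the stall begins
            } catch (InterruptedException e) {
                return;
            }
            // step 3: tragic event leads to a rollback
            synchronized (commitLock) {                 // step 4: blocks, commit thread owns it
                synchronized (mergeScheduler) {
                    stalled = false;                    // never reached while deadlocked
                    mergeScheduler.notifyAll();
                }
            }
        }, "merge");

        commitThread.setDaemon(true);
        mergeThread.setDaemon(true);
        commitThread.start();
        mergeThread.start();
        try {
            mergeThread.join(timeoutMillis);
            boolean stuck = commitThread.isAlive() && mergeThread.isAlive();
            commitThread.interrupt();                   // break the cycle so both threads exit
            commitThread.join();
            mergeThread.join();
            return stuck;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing threads", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + new StalledCommitDeadlockSketch().reproduce(500));
    }
}
```

In this sketch neither thread can make progress until the committing thread is interrupted: the merge thread is blocked entering commitLock, and the committing thread sits in wait() for a notification that only the merge thread would send. Because one side is in wait() rather than blocked on a monitor, the sketch checks thread liveness after a timeout instead of relying on a monitor-cycle detector.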

      We hit this bug with Lucene 5.5, but I looked at the code in the master branch and it looks like the deadlock still exists there as well.

      Attachments

        1. LUCENE-7570.patch
          6 kB
          Michael McCandless
        2. thread_dump.txt
          11 kB
          Martin Amirault

          People

            Assignee: Michael McCandless (mikemccand)
            Reporter: Joey Echeverria (fwiffo)
            Votes: 1
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved: