  1. Solr
  2. SOLR-13352

possible deadlock/threadleak from OverseerTriggerThread/AutoScalingWatcher during close()


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.7.2, 8.1, 9.0
    • Component/s: None
    • Labels: None

    Description

      A recent Jenkins failure in TestSimTriggerIntegration led me to what appears to be a "lock leak" in OverseerTriggerThread, in how the "updateLock" object is handled when the OverseerTriggerThread is closed.

      It's possible that this only affects tests that use SimCloudManager and call "simRestartOverseer", but I believe it can also lead to an actual deadlock / thread leak in a thread running AutoScalingWatcher (which holds references to the OverseerTriggerThread and every object reachable from it) when the OverseerTriggerThread is closed as part of a real Solr shutdown ... which I think would cause the JVM to stall until externally killed.


      If my analysis of the test failure (to follow in a comment) is correct, then even if this bug isn't likely to affect real-world Solr instances (and only surfaces because of how OverseerTriggerThread is used in SimCloudManager), the fix to OverseerTriggerThread is a trivial change to follow locking best practices (patch to follow).
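
      For readers unfamiliar with the pattern, here is a minimal sketch of how a "lock leak" of this kind can arise and what the usual lock/try/finally idiom looks like. It is not the attached patch and not the real OverseerTriggerThread code: the class LockLeakSketch, the "closed" flag, and the methods leakyUpdate, safeUpdate, and lockIsFree are hypothetical, and it assumes "updateLock" behaves like a java.util.concurrent.locks.ReentrantLock. Only the name "updateLock" comes from this issue.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockLeakSketch {
    // Hypothetical stand-ins; only the name "updateLock" is taken from the issue.
    private final ReentrantLock updateLock = new ReentrantLock();
    private volatile boolean closed = false;

    public void close() {
        closed = true;
    }

    // Leaky variant: the early return on "closed" skips unlock(),
    // so the calling thread keeps holding the lock forever.
    public void leakyUpdate() {
        updateLock.lock();
        if (closed) {
            return; // BUG: lock is never released on this path
        }
        try {
            // ... apply the update ...
        } finally {
            updateLock.unlock();
        }
    }

    // Safe variant: everything after lock() sits inside try/finally,
    // so every exit path (including the "closed" check) releases the lock.
    public void safeUpdate() {
        updateLock.lock();
        try {
            if (closed) {
                return;
            }
            // ... apply the update ...
        } finally {
            updateLock.unlock();
        }
    }

    // Probes from a second thread whether the lock can be acquired right now.
    public boolean lockIsFree() {
        final boolean[] acquired = {false};
        Thread probe = new Thread(() -> {
            if (updateLock.tryLock()) {
                acquired[0] = true;
                updateLock.unlock();
            }
        });
        probe.start();
        try {
            probe.join(); // join() makes the probe's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        LockLeakSketch leaky = new LockLeakSketch();
        leaky.close();
        leaky.leakyUpdate();
        System.out.println("leaky variant, lock free: " + leaky.lockIsFree());

        LockLeakSketch safe = new LockLeakSketch();
        safe.close();
        safe.safeUpdate();
        System.out.println("safe variant, lock free: " + safe.lockIsFree());
    }
}
```

      In the leaky variant, any other thread that later calls lock() on the same object would block indefinitely, which is the kind of stall described above. Whether the real code leaks the lock in exactly this way is an assumption of the sketch.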

      Attachments

        1. SOLR-13352.patch
          3 kB
          Chris M. Hostetter
        2. sarowe_Lucene-Solr-tests-master_20462.log.txt
          2.78 MB
          Chris M. Hostetter

        Activity

          People

            Assignee: hossman Chris M. Hostetter
            Reporter: hossman Chris M. Hostetter
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: