Apache YuniKorn / YUNIKORN-572

terminated app move causes deadlock

Details

    Description

      PR #250 for the placeholder cleanup introduced a possible deadlock.

      When a terminated app is moved out of the queue, the queue is unlinked from the app. Before that happens we make sure that the queue tracking is up to date, which requires an application lock.

      The move used to take a lock on the partition and do all its work in one go. With the change to remove the queue from the app, we now take the app lock while holding the partition lock. Since the app remains part of the active list until the move is done, scheduling might check the app at the same time. This could lead to a deadlock.

            People

              wilfreds Wilfred Spiegelenburg