Apache YuniKorn / YUNIKORN-572

terminated app move causes deadlock


Details

    Description

      PR #250 for the placeholder cleanup introduced a possible deadlock.

      When a terminated app is moved out of the queue, the queue is unlinked from the app. Before that happens, we make sure that the queue tracking is up to date. This requires an application lock.

      The move used to take a lock on the partition and do all its work in one go. With the change to remove the queue from the app, we now lock the app inside the partition lock. Since the app stays in the active list until the move is done, scheduling might check the app at the same time. This could lead to a deadlock.
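
The locking hazard described above can be illustrated with a minimal Go sketch. This is not the YuniKorn code or the fix applied for this issue: `Application`, `Partition`, `moveTerminatedApp`, and `schedule` are hypothetical stand-ins, and the sketch assumes the scheduling path takes the app lock first and then reads partition state. Under that assumption, a move that takes the app lock while holding the partition lock could deadlock, so the sketch does the app-locked work before taking the partition lock and never holds both at once.

```go
package main

import (
	"fmt"
	"sync"
)

// Application is a hypothetical stand-in for a scheduler application.
type Application struct {
	sync.RWMutex
	id    string
	queue string // linked queue name, empty once unlinked
}

// Partition is a hypothetical stand-in that tracks active and completed apps.
type Partition struct {
	sync.RWMutex
	active    map[string]*Application
	completed map[string]*Application
}

// unlinkQueue clears the queue link while holding only the app lock.
func (a *Application) unlinkQueue() {
	a.Lock()
	defer a.Unlock()
	a.queue = ""
}

// moveTerminatedApp never nests the two locks: it looks up the app,
// releases the partition lock, does the app-locked work, and only then
// retakes the partition lock to move the app between the lists.
func (p *Partition) moveTerminatedApp(id string) bool {
	p.RLock()
	app := p.active[id]
	p.RUnlock()
	if app == nil {
		return false
	}
	app.unlinkQueue() // app lock only, partition lock is not held

	p.Lock()
	defer p.Unlock()
	if _, ok := p.active[id]; !ok {
		return false // already moved by another caller
	}
	delete(p.active, id)
	p.completed[id] = app
	return true
}

// schedule models the assumed scheduling path: app lock first, then a
// partition read. It returns how many apps were still active when checked.
func (p *Partition) schedule() int {
	p.RLock()
	apps := make([]*Application, 0, len(p.active))
	for _, a := range p.active {
		apps = append(apps, a)
	}
	p.RUnlock()

	count := 0
	for _, a := range apps {
		a.Lock()
		p.RLock() // would block forever if a mover held p.Lock and waited on a
		if _, ok := p.active[a.id]; ok {
			count++
		}
		p.RUnlock()
		a.Unlock()
	}
	return count
}

func main() {
	p := &Partition{
		active:    map[string]*Application{"app-1": {id: "app-1", queue: "root.default"}},
		completed: map[string]*Application{},
	}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); p.schedule() }()
	go func() { defer wg.Done(); p.moveTerminatedApp("app-1") }()
	wg.Wait()
	fmt.Println(len(p.active), len(p.completed))
}
```

The design choice in this sketch is to avoid nested locking altogether rather than to enforce a fixed lock order. The cost is that the app can change between the lookup and the final move, which is why the move rechecks the active list after retaking the partition lock.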


          People

            wilfreds Wilfred Spiegelenburg
            wilfreds Wilfred Spiegelenburg
            Votes: 0
            Watchers: 2
