Hadoop YARN / YARN-325

RM CapacityScheduler can deadlock when getQueueInfo() is called and a container is completing

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.2-alpha, 0.23.5
    • Fix Version/s: 2.0.3-alpha, 0.23.6
    • Component/s: capacityscheduler
    • Labels:
      None

      Description

      If a client calls getQueueInfo() on a parent queue (e.g. the root queue) while containers are completing, the RM can deadlock. getQueueInfo() locks the ParentQueue and then calls each child queue's getQueueInfo() method in turn, which requires the child's lock. Container completion takes the locks in the opposite order: it locks the LeafQueue and then calls back into the ParentQueue. If the two run concurrently, each thread can end up holding the lock the other is waiting for, and neither can proceed.
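
The lock-order inversion described above can be reproduced in isolation. The following is a minimal sketch, not the actual CapacityScheduler code: the classes ParentQueue and LeafQueue here are simplified stand-ins, and the method signatures, the CountDownLatch used to force the bad interleaving, and the class name QueueLockOrderSketch are all assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class QueueLockOrderSketch {

    // Hypothetical stand-in for the scheduler's parent queue.
    static class ParentQueue {
        LeafQueue child;

        // Locks the parent, then needs the child's lock (parent -> child).
        synchronized void getQueueInfo(CountDownLatch bothLocked) throws InterruptedException {
            bothLocked.countDown();
            bothLocked.await();          // wait until the other thread holds the leaf lock
            child.getQueueInfo();
        }

        synchronized void completedContainer() { }
    }

    // Hypothetical stand-in for the scheduler's leaf queue.
    static class LeafQueue {
        ParentQueue parent;

        synchronized void getQueueInfo() { }

        // Locks the leaf, then calls back into the parent (child -> parent).
        synchronized void completedContainer(CountDownLatch bothLocked) throws InterruptedException {
            bothLocked.countDown();
            bothLocked.await();          // wait until the other thread holds the parent lock
            parent.completedContainer();
        }
    }

    // Returns true if the JVM reports a monitor deadlock between the two threads.
    static boolean reproduceDeadlock() {
        ParentQueue root = new ParentQueue();
        LeafQueue leaf = new LeafQueue();
        root.child = leaf;
        leaf.parent = root;

        CountDownLatch bothLocked = new CountDownLatch(2);
        Thread client = new Thread(() -> {
            try { root.getQueueInfo(bothLocked); } catch (InterruptedException ignored) { }
        }, "client-getQueueInfo");
        Thread scheduler = new Thread(() -> {
            try { leaf.completedContainer(bothLocked); } catch (InterruptedException ignored) { }
        }, "container-completion");

        // Daemon threads so the stuck threads do not keep the JVM alive.
        client.setDaemon(true);
        scheduler.setDaemon(true);
        client.start();
        scheduler.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 100; i++) {
                if (threads.findMonitorDeadlockedThreads() != null) {
                    return true;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduceDeadlock());
    }
}
```

The latch only makes the interleaving deterministic; in the RM the same state would be reached by timing alone. In general, this class of deadlock goes away once every code path acquires the two locks in the same order, or one path stops holding its first lock while calling into the other queue.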

      Stacktrace to follow.

        Attachments

        1. YARN-325-branch23.patch
          8 kB
          Thomas Graves
        2. YARN-325.patch
          9 kB
          Arun C Murthy
        3. YARN-325.patch
          7 kB
          Arun C Murthy

            People

            • Assignee:
              acmurthy Arun C Murthy
              Reporter:
              jlowe Jason Lowe
            • Votes:
              0
              Watchers:
              7
