Hadoop YARN / YARN-3414

FairScheduler's preemption may cause livelock

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: fairscheduler
    • Labels: None

    Description

      I ran into this problem on our cluster: preemption and scheduling fall into a livelock.

      The queue hierarchy is as follows:

                            root
                    /        |        \
                queue-1    queue-2    queue-3     
                /    \
      queue-1-1      queue-1-2
      
      1. Assume the cluster has 100G of memory.
      2. Assume queue-1 has a max resource limit of 20G.
      3. queue-1-1 becomes active and gets at most 20G of memory (equal to its fair share).
      4. queue-2 then becomes active and requires 30G of memory (less than its fair share).
      5. queue-3 becomes active and can be assigned all remaining resources, 50G of memory (more than its fair share). At this point the three top-level queues' fair shares are (20, 40, 40) and their usage is (20, 30, 50).
      6. queue-1-2 becomes active, which triggers a new preemption request (10G of memory; intuitively it can only preempt from its sibling queue-1-1).
      7. In fact, preemption starts from root, finds that queue-3 is the most over its fair share, and preempts some resources from queue-3.
      8. During scheduling, however, the scheduler finds that queue-1 has already reached its max fair share (20G) and cannot assign resources to it, so the resources are assigned to queue-3 again.
      The last two steps then repeat indefinitely.
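
The scenario above can be sketched as a small simulation. This is a toy model written for illustration, not the FairScheduler code: the names `fair_shares`, `simulate`, `PARENT` and `MAX_GB` are hypothetical, the 10G container size is an assumption, and queue-3 is assumed to always have pending demand.

```python
CLUSTER_GB = 100
MAX_GB = {"queue-1": 20}  # only queue-1 has a max resource limit
# Leaf queue -> top-level queue it belongs to.
PARENT = {"queue-1-1": "queue-1", "queue-1-2": "queue-1",
          "queue-2": "queue-2", "queue-3": "queue-3"}


def fair_shares(queues, total, caps):
    """Split total equally, redistributing what capped queues cannot use."""
    shares, remaining, pending = {}, total, list(queues)
    while pending:
        equal = remaining / len(pending)
        capped = [q for q in pending if caps.get(q, float("inf")) < equal]
        if not capped:
            for q in pending:
                shares[q] = equal
            break
        for q in capped:
            shares[q] = caps[q]
            remaining -= caps[q]
            pending.remove(q)
    return shares


def top_level_usage(usage):
    """Sum leaf usage per top-level queue."""
    totals = {"queue-1": 0, "queue-2": 0, "queue-3": 0}
    for leaf, gb in usage.items():
        totals[PARENT[leaf]] += gb
    return totals


def simulate(rounds=5, container_gb=10):
    """Alternate one preemption pass and one scheduling pass per round."""
    usage = {"queue-1-1": 20, "queue-1-2": 0, "queue-2": 30, "queue-3": 50}
    share = fair_shares(["queue-1", "queue-2", "queue-3"], CLUSTER_GB, MAX_GB)
    history = []
    for _ in range(rounds):
        # Step 7: start at root and pick the queue most over its fair share.
        top = top_level_usage(usage)
        victim_top = max(top, key=lambda q: top[q] - share[q])
        victim = max((leaf for leaf in usage if PARENT[leaf] == victim_top),
                     key=usage.get)
        usage[victim] -= container_gb
        # Step 8: queue-1 is already at its limit, so starved queue-1-2
        # is skipped and the freed memory goes back to queue-3.
        top = top_level_usage(usage)
        if top["queue-1"] + container_gb <= MAX_GB["queue-1"]:
            recipient = "queue-1-2"
        else:
            recipient = "queue-3"  # assumed to still have pending demand
        usage[recipient] += container_gb
        history.append((victim, recipient, dict(usage)))
    return history


for victim, recipient, usage in simulate(rounds=3):
    print(f"preempt from {victim}, assign to {recipient}, "
          f"queue-1-2 holds {usage['queue-1-2']}G")
```

In this model every round reports queue-3 as both the preemption victim and the recipient, and queue-1-2 never receives memory, which mirrors the repetition between steps 7 and 8.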

            People

              Assignee: Unassigned
              Reporter: Peng Zhang (peng.zhang)
              Votes: 0
              Watchers: 10
