Hadoop YARN / YARN-2151

FairScheduler option for global preemption within hierarchical queues


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: fairscheduler

    Description

      FairScheduler has hierarchical queues, but fair share calculation and
      preemption still work only within a limited scope, so they are effectively nonhierarchical.

      This patch addresses the gap in two ways:

      1. Currently MinShare is not propagated to parent queues, so the
      fair share calculation ignores all min shares set in deeper queues.
      Consider the following example
      (implemented as test case TestFairScheduler#testMinShareInHierarchicalQueues):

      <?xml version="1.0"?>
      <allocations>
      <queue name="queue1">
        <maxResources>10240mb, 10vcores</maxResources>
        <queue name="big"/>
        <queue name="sub1">
          <schedulingPolicy>fair</schedulingPolicy>
          <queue name="sub11">
            <minResources>6192mb, 6vcores</minResources>
          </queue>
        </queue>
        <queue name="sub2">
        </queue>
      </queue>
      </allocations>
      

      bigApp is then started in queue1.big with 10x1GB containers,
      which consumes all of the resources allowed for queue1.
      Subsequent requests from app1 (queue1.sub1.sub11) and
      app2 (queue1.sub2), 5x1GB each, have to wait for free resources.
      Note that sub11 has a min share requirement of 6x1GB.
      Without this patch, fair share is calculated with no knowledge
      of the min share requirement, and app1 and app2 get an equal
      number of containers.
      With the patch, resources are split according to min share (in the test,
      5 containers for app1 and 1 for app2).
      This behaviour is controlled by the same parameter as global preemption,
      ‘globalPreemption’, but that can be changed easily.
      The implementation is a bit awkward. The method that recalculates min share
      could be exposed as a public or protected API, and the FSQueue constructor
      could call it before using the minShare getter. For now, though, the
      current implementation with nulls should work too.
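
      The propagation described in item 1 can be sketched roughly as follows.
      This is a minimal illustration, not the code from the patch: the classes
      MinShareSketch and QueueNode, the method aggregatedMinMb, and the rule
      "a parent's effective min share is the larger of its own configured value
      and the sum of its children's" are all assumptions made for the example.
      Only memory is modelled, using the queue1 hierarchy from the allocation
      file above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical queue model for illustration; this is not the FSQueue API. */
class MinShareSketch {

    static final class QueueNode {
        final String name;
        final long configuredMinMb; // <minResources> memory on this queue, 0 if unset
        final List<QueueNode> children = new ArrayList<>();

        QueueNode(String name, long configuredMinMb) {
            this.name = name;
            this.configuredMinMb = configuredMinMb;
        }

        QueueNode add(QueueNode child) {
            children.add(child);
            return this;
        }

        /**
         * Assumed aggregation rule: the effective min share is the larger of
         * the configured value and the sum of the children's effective values.
         */
        long aggregatedMinMb() {
            long fromChildren = 0;
            for (QueueNode child : children) {
                fromChildren += child.aggregatedMinMb();
            }
            return Math.max(configuredMinMb, fromChildren);
        }
    }

    /** Builds the queue1 hierarchy from the allocation file (memory only). */
    static QueueNode buildQueue1() {
        QueueNode sub1 = new QueueNode("sub1", 0).add(new QueueNode("sub11", 6192));
        return new QueueNode("queue1", 0)
            .add(new QueueNode("big", 0))
            .add(sub1)
            .add(new QueueNode("sub2", 0));
    }

    public static void main(String[] args) {
        QueueNode queue1 = buildQueue1();
        for (QueueNode child : queue1.children) {
            System.out.println(child.name + " min share (MB): " + child.aggregatedMinMb());
        }
        System.out.println(queue1.name + " min share (MB): " + queue1.aggregatedMinMb());
    }
}
```

      Under this assumed rule, sub1 inherits the 6192 MB min share of sub11
      while big and sub2 stay at zero, so a fair share computation among the
      children of queue1 can see the requirement declared two levels down.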

      2. Preemption doesn’t work between queues on different levels of the
      queue hierarchy. Moreover, it is not possible to override various
      parameters for child queues.
      This patch adds the parameter ‘globalPreemption’, which enables the
      modifications to the global preemption algorithm.
      In a nutshell, the patch adds a function shouldAttemptPreemption(queue),
      which calculates usage for nested queues; if a queue with usage above
      the specified threshold is found, preemption can be triggered.
      Aggregated minShare does the rest of the work, and preemption works
      as expected within a hierarchy of queues with different MinShare/MaxShare
      specifications on different levels.
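
      The per-queue check described in item 2 might look like the sketch
      below. It is an illustration under stated assumptions rather than the
      patch itself: only the name shouldAttemptPreemption(queue) comes from
      the description, while the class GlobalPreemptionSketch, the QueueNode
      model, the usageMb helper, the explicit threshold argument, and the
      0.8 value used in main are hypothetical. Usage is compared against each
      queue's configured maximum, and queues without a maximum are skipped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a recursive preemption check; not the patch code. */
class GlobalPreemptionSketch {

    static final class QueueNode {
        final String name;
        final long maxMb;  // <maxResources> memory, 0 if no limit is configured
        final long usedMb; // memory used directly by apps in this queue
        final List<QueueNode> children = new ArrayList<>();

        QueueNode(String name, long maxMb, long usedMb) {
            this.name = name;
            this.maxMb = maxMb;
            this.usedMb = usedMb;
        }

        QueueNode add(QueueNode child) {
            children.add(child);
            return this;
        }
    }

    /** Usage of a queue is its own usage plus that of all nested queues. */
    static long usageMb(QueueNode queue) {
        long total = queue.usedMb;
        for (QueueNode child : queue.children) {
            total += usageMb(child);
        }
        return total;
    }

    /** True if this queue or any nested queue uses more than threshold * max. */
    static boolean shouldAttemptPreemption(QueueNode queue, double threshold) {
        if (queue.maxMb > 0 && usageMb(queue) > threshold * queue.maxMb) {
            return true;
        }
        for (QueueNode child : queue.children) {
            if (shouldAttemptPreemption(child, threshold)) {
                return true;
            }
        }
        return false;
    }

    /** queue1 from the allocation file, with bigApp using bigUsedMb in queue1.big. */
    static QueueNode buildQueue1(long bigUsedMb) {
        QueueNode sub1 = new QueueNode("sub1", 0, 0).add(new QueueNode("sub11", 0, 0));
        return new QueueNode("queue1", 10240, 0)
            .add(new QueueNode("big", 0, bigUsedMb))
            .add(sub1)
            .add(new QueueNode("sub2", 0, 0));
    }

    public static void main(String[] args) {
        double threshold = 0.8; // assumed value for the example
        System.out.println(shouldAttemptPreemption(buildQueue1(10240), threshold));
        System.out.println(shouldAttemptPreemption(buildQueue1(4096), threshold));
    }
}
```

      With bigApp holding all 10240 MB of queue1, the check fires at the
      queue1 level even though the usage sits in a child queue, which is the
      cross-level behaviour the description asks for.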

      Test case TestFairScheduler#testGlobalPreemption shows how it works.
      One big app gets resources above its fair share, and app1 has a declared
      min share. On submission, the code detects the starvation and preempts enough
      containers to make room for app1.

      Attachments

        1. YARN-2151.patch (36 kB, Andrey Stepachev)


            People

              Assignee: Unassigned
              Reporter: Andrey Stepachev (octo47)
              Votes: 0
              Watchers: 7
