  Hadoop YARN / YARN-5139

[Umbrella] Move YARN scheduler towards global scheduler

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      The existing YARN scheduler is driven by node heartbeats. This can lead to sub-optimal decisions because the scheduler can only look at one node at a time when scheduling resources.

      Pseudocode of the existing scheduling logic looks like:

      for node in allNodes:
          go to parentQueue
              go to leafQueue
                  for application in leafQueue.applications:
                      for resource-request in application.resource-requests:
                          try to schedule resource-request on node

      Future resource placement requirements are likely to be more complex, such as node constraints (give me "a && b || c") or anti-affinity (do not allocate HBase RegionServers and Storm workers on the same host). To support them, we may need to consider moving the YARN scheduler towards global scheduling.
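
      To make the contrast concrete, here is a minimal Python sketch, not taken from any of the attached patches. It compares a node-at-a-time loop (as in the pseudocode above) with a request-at-a-time loop that can see every node. All names (`Node`, `Request`, `fits`, `heartbeat_schedule`, `global_schedule`) are hypothetical, capacities are abstract units, and the "tightest fit" policy in the global variant is only an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    name: str
    labels: frozenset                       # node attributes, e.g. {"a", "b"}
    free: int                               # free capacity, abstract units
    apps: set = field(default_factory=set)  # app types already placed here


@dataclass
class Request:
    app: str
    size: int
    # Predicate over node labels; the default accepts any node.
    constraint: Callable[[frozenset], bool] = lambda labels: True
    # Anti-affinity: app types that must not share a host with this one.
    avoid: frozenset = frozenset()


def fits(node, req):
    """True if capacity, label constraint, and anti-affinity all hold."""
    return (node.free >= req.size
            and req.constraint(node.labels)
            and not (node.apps & req.avoid))


def place(node, req):
    node.free -= req.size
    node.apps.add(req.app)


def heartbeat_schedule(nodes, requests):
    """Node-at-a-time: each heartbeat tries the pending requests on that node only."""
    placements, pending = {}, list(requests)
    for node in nodes:                      # one heartbeat per node
        for req in list(pending):           # iterate over a copy while removing
            if fits(node, req):
                place(node, req)
                placements[req.app] = node.name
                pending.remove(req)
    return placements


def global_schedule(nodes, requests):
    """Request-at-a-time: each request is compared against all nodes."""
    placements = {}
    for req in requests:
        candidates = [n for n in nodes if fits(n, req)]
        if candidates:
            best = min(candidates, key=lambda n: n.free)  # assumed policy: tightest fit
            place(best, req)
            placements[req.app] = best.name
    return placements


def make_nodes():
    return [Node("n1", frozenset({"c"}), 8),
            Node("n2", frozenset({"a", "b"}), 2)]


def make_requests():
    # Constraint mirrors the "a && b || c" example from the description.
    a_and_b_or_c = lambda labels: ({"a", "b"} <= labels) or ("c" in labels)
    return [Request("hbase", 2, constraint=a_and_b_or_c, avoid=frozenset({"storm"})),
            Request("storm", 8, avoid=frozenset({"hbase"}))]


if __name__ == "__main__":
    print(heartbeat_schedule(make_nodes(), make_requests()))
    print(global_schedule(make_nodes(), make_requests()))
```

      In this toy cluster the heartbeat loop puts the small `hbase` request on the large node `n1` because `n1` reports first, which leaves no node able to host `storm`. The global loop sees that `hbase` also fits on `n2` and keeps `n1` free for `storm`, so both requests are placed on separate hosts.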

        Attachments

        1. wip-1.YARN-5139.patch
          76 kB
          Wangda Tan
        2. wip-2.YARN-5139.patch
          95 kB
          Wangda Tan
        3. wip-3.YARN-5139.patch
          133 kB
          Wangda Tan
        4. YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf
          245 kB
          Wangda Tan
        5. YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf
          285 kB
          Wangda Tan
        6. wip-4.YARN-5139.patch
          447 kB
          Wangda Tan
        7. Explanantions of Global Scheduling (YARN-5139) Implementation.pdf
          283 kB
          Wangda Tan
        8. YARN-5139-Concurrent-scheduling-performance-report.pdf
          96 kB
          Wangda Tan
        9. wip-5.YARN-5139.patch
          386 kB
          Wangda Tan
        10. YARN-5139.000.patch
          402 kB
          Wangda Tan

          Issue Links

          1. Add global scheduler interface definition and update CapacityScheduler to use it.
            Sub-task, Resolved, Wangda Tan
          2. Update AppSchedulingInfo to use SchedulingPlacementSet
            Sub-task, Resolved, Wangda Tan
          3. Introduce api independent PendingAsk to replace usage of ResourceRequest within Scheduler classes
            Sub-task, Resolved, Wangda Tan
          4. Update javadocs of new added APIs / classes of scheduler/AppSchedulingInfo
            Sub-task, Open, Wangda Tan
          5. Should consider utilization of each ResourceType on node while scheduling
            Sub-task, Open, Unassigned
          6. Global scheduler applies to Fair scheduler
            Sub-task, Open, Zhaohui Xin
          7. Rename PlacementSet and SchedulingPlacementSet
            Sub-task, Resolved, Wangda Tan
          8. Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest / placement algorithm
            Sub-task, Resolved, Wangda Tan
          9. Delay scheduling should be an individual policy instead of part of scheduler implementation
            Sub-task, Open, Tao Yang
          10. Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision
            Sub-task, Resolved, Sunil Govindan
          11. Introduce scheduler specific environment variable support in ApplicationSubmissionContext for better scheduling placement configurations
            Sub-task, Resolved, Sunil Govindan
          12. Pending backlog for async allocation threads should be configurable
            Sub-task, Resolved, Tao Yang
          13. The capacity scheduler logs too frequently seriously affecting performance
            Sub-task, Open, YunFan Zhou
          14. Resource leak caused by a reserved container being released more than once under async scheduling
            Sub-task, Resolved, Tao Yang
          15. Support dynamic policy updates in Capacity Scheduler
            Sub-task, Open, Unassigned
          16. Exclude lagged/unhealthy/decommissioned nodes in async allocating thread
            Sub-task, Open, Unassigned
          17. Add multi-thread asynchronous scheduling to fair scheduler
            Sub-task, Open, Unassigned

              People

              • Assignee: Wangda Tan (leftnoteasy)
              • Reporter: Wangda Tan (leftnoteasy)
              • Votes: 5
              • Watchers: 78