Hadoop YARN / YARN-5139

[Umbrella] Move YARN scheduler towards global scheduler


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The existing YARN scheduler is driven by node heartbeats: the scheduler looks at only one node at a time when scheduling resources, which can lead to sub-optimal placement decisions.

      Pseudo code of the existing scheduling logic looks like:

      for node in allNodes:
          walk down the queue hierarchy (parentQueue -> leafQueue)
          for application in leafQueue.applications:
              for resourceRequest in application.resourceRequests:
                  try to schedule resourceRequest on node
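
      For illustration, the following is a minimal, compilable Java sketch of that per-node loop, with a small driver. All class and method names here (HeartbeatSchedulingSketch, onNodeHeartbeat, and the model types) are hypothetical stand-ins for this write-up, not actual YARN scheduler classes.

      import java.util.ArrayList;
      import java.util.List;

      // Hypothetical model types for illustration; not real YARN classes.
      class ResourceRequest {
          final int memoryMb;
          ResourceRequest(int memoryMb) { this.memoryMb = memoryMb; }
      }

      class App {
          final List<ResourceRequest> resourceRequests = new ArrayList<>();
      }

      class LeafQueue {
          final List<App> applications = new ArrayList<>();
      }

      class Node {
          final String host;
          int availableMb;
          Node(String host, int availableMb) { this.host = host; this.availableMb = availableMb; }
      }

      public class HeartbeatSchedulingSketch {
          // Invoked once per node heartbeat: while walking the queues, the
          // scheduler only ever sees this single node, which is the
          // limitation described above.
          static void onNodeHeartbeat(Node node, List<LeafQueue> leafQueues) {
              for (LeafQueue queue : leafQueues) {                // queue hierarchy
                  for (App app : queue.applications) {
                      for (ResourceRequest req : app.resourceRequests) {
                          if (node.availableMb >= req.memoryMb) { // fits on this node?
                              node.availableMb -= req.memoryMb;   // allocate here
                          }
                      }
                  }
              }
          }

          public static void main(String[] args) {
              Node node = new Node("host-1", 2048);
              LeafQueue queue = new LeafQueue();
              App app = new App();
              app.resourceRequests.add(new ResourceRequest(1024));
              queue.applications.add(app);
              onNodeHeartbeat(node, List.of(queue));
              System.out.println(node.host + " has " + node.availableMb + " MB left"); // 1024 MB
          }
      }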

      Considering future complex resource placement requirements, such as node constraints (e.g. give me nodes satisfying "a && b || c") or anti-affinity (e.g. do not allocate HBase RegionServers and Storm workers on the same host), we may need to move the YARN scheduler towards global scheduling, where each placement decision can look across many candidate nodes at once.
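
      To make the contrast concrete, below is a hedged sketch of what a global decision could look like: the scheduler filters a whole candidate set by capacity and a simple anti-affinity constraint, then picks the best node under a pluggable policy. The types and the "most free memory" policy are assumptions for illustration, not the actual YARN-5139 interfaces.

      import java.util.Comparator;
      import java.util.List;
      import java.util.Optional;
      import java.util.Set;

      // Hypothetical types for illustration; not the actual YARN-5139 interfaces.
      class CandidateNode {
          final String host;
          final int availableMb;
          final Set<String> runningAppTags; // e.g. "hbase-regionserver"
          CandidateNode(String host, int availableMb, Set<String> tags) {
              this.host = host;
              this.availableMb = availableMb;
              this.runningAppTags = tags;
          }
      }

      public class GlobalPlacementSketch {
          // Pick the best node from the whole candidate set, skipping nodes
          // that already run an application we are anti-affine to. "Most free
          // memory" stands in for a pluggable node-sorting policy.
          static Optional<CandidateNode> place(List<CandidateNode> candidates,
                                               int requestMb, String antiAffinityTag) {
              return candidates.stream()
                  .filter(n -> n.availableMb >= requestMb)
                  .filter(n -> !n.runningAppTags.contains(antiAffinityTag))
                  .max(Comparator.comparingInt((CandidateNode n) -> n.availableMb));
          }

          public static void main(String[] args) {
              List<CandidateNode> candidates = List.of(
                  new CandidateNode("host-1", 8192, Set.of("hbase-regionserver")),
                  new CandidateNode("host-2", 4096, Set.of()));
              // Place a Storm worker while keeping it off HBase RegionServer hosts:
              place(candidates, 2048, "hbase-regionserver")
                  .ifPresent(n -> System.out.println("picked " + n.host)); // host-2
          }
      }

      With the whole candidate set in view, a constraint like the HBase/Storm anti-affinity above becomes a simple filter over candidates, rather than something each per-node heartbeat has to approximate.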

      Attachments

        1. YARN-5139.000.patch (402 kB, Wangda Tan)
        2. wip-5.YARN-5139.patch (386 kB, Wangda Tan)
        3. YARN-5139-Concurrent-scheduling-performance-report.pdf (96 kB, Wangda Tan)
        4. Explanantions of Global Scheduling (YARN-5139) Implementation.pdf (283 kB, Wangda Tan)
        5. wip-4.YARN-5139.patch (447 kB, Wangda Tan)
        6. YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf (285 kB, Wangda Tan)
        7. YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf (245 kB, Wangda Tan)
        8. wip-3.YARN-5139.patch (133 kB, Wangda Tan)
        9. wip-2.YARN-5139.patch (95 kB, Wangda Tan)
        10. wip-1.YARN-5139.patch (76 kB, Wangda Tan)

        Issue Links (all sub-tasks of this umbrella)

        1. Add global scheduler interface definition and update CapacityScheduler to use it. (Resolved, Wangda Tan)
        2. Update AppSchedulingInfo to use SchedulingPlacementSet (Resolved, Wangda Tan)
        3. Introduce api independent PendingAsk to replace usage of ResourceRequest within Scheduler classes (Resolved, Wangda Tan)
        4. Update javadocs of new added APIs / classes of scheduler/AppSchedulingInfo (Open, Wangda Tan)
        5. Should consider utilization of each ResourceType on node while scheduling (Open, Qi Zhu)
        6. Global scheduler applies to Fair scheduler (Open, Zhaohui Xin)
        7. Rename PlacementSet and SchedulingPlacementSet (Resolved, Wangda Tan)
        8. Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest / placement algorithm (Resolved, Wangda Tan)
        9. Delay scheduling should be an individual policy instead of part of scheduler implementation (Open, Tao Yang)
        10. Add multi-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision (Resolved, Sunil G)
        11. Introduce scheduler specific environment variable support in ApplicationSubmissionContext for better scheduling placement configurations (Resolved, Sunil G)
        12. Pending backlog for async allocation threads should be configurable (Resolved, Tao Yang)
        13. The capacity scheduler logs too frequently seriously affecting performance (Open, YunFan Zhou)
        14. Resource leak caused by a reserved container being released more than once under async scheduling (Resolved, Tao Yang)
        15. Support dynamic policy updates in Capacity Scheduler (Open, Qi Zhu)
        16. Exclude lagged/unhealthy/decommissioned nodes in async allocating thread (Resolved, Qi Zhu; time spent: 1h 20m)
        17. Add multi-thread asynchronous scheduling to fair scheduler (Open, Unassigned)
        18. Use threadPool to handle async scheduling threads (Open, Aihua Xu)
        19. Skip schedule on not heartbeated nodes in Multi Node Placement (Resolved, Prabhu Joseph)
        20. Proactively relocate allocated containers from a stopped node (Open, Tanu Ajmera)
        21. Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (Resolved, Prabhu Joseph)
        22. Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259) (Resolved, Prabhu Joseph)
        23. Support Multi Node Placement in SingleConstraintAppPlacementAllocator (Resolved, Prabhu Joseph)
        24. Import logic of multi-node allocation in CapacityScheduler (Resolved, Qi Zhu; time spent: 2h 10m)
        25. Merge YARN-8557 and YARN-10352, and rebase based YARN-10380. (Resolved, Qi Zhu)


          People

            leftnoteasy Wangda Tan
            leftnoteasy Wangda Tan


              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3.5h
