  1. Hadoop YARN
  2. YARN-45

[Preemption] Scheduler feedback to AM to release containers

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0-beta
    • Component/s: resourcemanager
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The ResourceManager strikes a balance between cluster utilization and strict enforcement of resource invariants in the cluster. When cluster load shifts, individual container allocations must be reclaimed or reserved to restore the global invariants. In some cases, the ApplicationMaster can respond to fluctuations in resource availability without losing the work its tasks have already completed (MAPREDUCE-4584). Supplying the AM with this information would help overall cluster utilization [1]. To this end, we want to establish a protocol for the RM to ask the AM to release containers.

      [1] http://research.yahoo.com/files/yl-2012-003.pdf
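
      As a rough illustration of the AM side of such a protocol, the sketch below shows one way an ApplicationMaster might choose which containers to give back when the RM asks it to free a given amount of memory. Everything here is an assumption made for illustration: `PreemptionSketch`, `Container`, `chooseContainersToRelease`, and the least-progress-first policy are hypothetical and are not the types or the policy introduced by the attached patches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PreemptionSketch {

    /** Hypothetical stand-in for a running container; not a YARN type. */
    static final class Container {
        final String id;
        final int memoryMb;
        final double progress; // fraction of the task's work completed, 0.0 to 1.0

        Container(String id, int memoryMb, double progress) {
            this.id = id;
            this.memoryMb = memoryMb;
            this.progress = progress;
        }
    }

    /**
     * Assumed AM-side policy: release the containers with the least progress
     * first, stopping once the requested amount of memory is covered, so that
     * as little completed work as possible is lost.
     */
    static List<Container> chooseContainersToRelease(List<Container> running,
                                                     int memoryMbToReclaim) {
        // Sort a copy so the caller's list is left untouched.
        List<Container> candidates = new ArrayList<>(running);
        candidates.sort(Comparator.comparingDouble((Container c) -> c.progress));

        List<Container> chosen = new ArrayList<>();
        int reclaimedMb = 0;
        for (Container c : candidates) {
            if (reclaimedMb >= memoryMbToReclaim) {
                break; // the request is already satisfied
            }
            chosen.add(c);
            reclaimedMb += c.memoryMb;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Container> running = new ArrayList<>();
        running.add(new Container("container_1", 1024, 0.9));
        running.add(new Container("container_2", 2048, 0.1));
        running.add(new Container("container_3", 1024, 0.5));

        // The RM (hypothetically) asks this AM to free 2500 MB.
        for (Container c : chooseContainersToRelease(running, 2500)) {
            System.out.println(c.id);
        }
    }
}
```

      With the sample data in `main`, the sketch selects container_2 and then container_3, leaving the nearly finished container_1 running. A real AM could equally checkpoint the selected tasks before releasing them, which is the work-preserving case the description refers to.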

      Attachments

        1. YARN-45_design_thoughts.pdf
          24 kB
          Carlo Curino
        2. YARN-45.1.patch
          48 kB
          Carlo Curino
        3. YARN-45.patch
          46 kB
          Carlo Curino
        4. YARN-45.patch
          16 kB
          Carlo Curino
        5. YARN-45.patch
          13 kB
          Carlo Curino
        6. YARN-45.patch
          14 kB
          Carlo Curino
        7. YARN-45.patch
          18 kB
          Carlo Curino
        8. YARN-45.patch
          15 kB
          Carlo Curino

        Issue Links

          1. RM changes to support preemption for FairScheduler and CapacityScheduler | Sub-task | Closed | Carlo Curino
          2. CapacityScheduler: support for preemption (using a capacity monitor) | Sub-task | Closed | Carlo Curino
          3. FairScheduler: support for work-preserving preemption | Sub-task | Closed | Carlo Curino
          4. Expose preemption warnings in AMRMClient | Sub-task | Open | Sandy Ryza
          5. User guide for preemption | Sub-task | Resolved | Unassigned
          6. Preemption caused Invalid State Event: ACQUIRED at KILLED and caused a task timeout for 30mins | Sub-task | Closed | Sunil G
          7. Preempting an Application Master container can be kept as least priority when multiple applications are marked for preemption by ProportionalCapacityPreemptionPolicy | Sub-task | Closed | Sunil G
          8. observeOnly should be checked before any preemption computation started inside containerBasedPreemptOrKill() of ProportionalCapacityPreemptionPolicy.java | Sub-task | Resolved | Unassigned
          9. ProportionalCapacitPreemptionPolicy handling of corner cases... | Sub-task | Closed | Carlo Curino
          10. Preemption message shouldn’t be created multiple times for same container-id in ProportionalCapacityPreemptionPolicy | Sub-task | Resolved | Wangda Tan
          11. ClassCastException is thrown during preemption When a huge job is submitted to a queue B whose resources is used by a job in queueA | Sub-task | Closed | Christopher Douglas
          12. ContainerExistStatus should define a status for preempted containers | Sub-task | Closed | Alejandro Abdelnur
          13. Add preemption to CS | Sub-task | Resolved | Arun Murthy
          14. Inconsistent picture of how a container was killed when querying RM and NM in case of preemption | Sub-task | Open | Unassigned
          15. Preemption: Jobs are failing due to AMs are getting launched and killed multiple times | Sub-task | Resolved | Unassigned
          16. Disable preemption at Queue level | Sub-task | Closed | Eric Payne
          17. CS queue level preemption should respect user-limits | Sub-task | Open | Mayank Bansal
          18. Preemption of AM containers shouldn't count towards AM failures | Sub-task | Closed | Jian He
          19. Preemption can prevent progress in small queues | Sub-task | Open | Wangda Tan
          20. Capacity scheduler preemption policy should respect yarn.scheduler.minimum-allocation-mb when computing resource of queues | Sub-task | Open | Unassigned
          21. Indicate preemption timout along with the list of containers to AM (preemption message) | Sub-task | Open | Sunil G
          22. Introducing CANCEL_PREEMPTION event to notify Scheduler and AM when a container is no longer to be preempted | Sub-task | Open | Sunil G
          23. CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request | Sub-task | Resolved | Wangda Tan
          24. Refactor existing Preemption Policy of CS for easier adding new approach to select preemption candidates | Sub-task | Resolved | Wangda Tan
          25. Do surgical preemption based on reserved container in CapacityScheduler | Sub-task | Resolved | Wangda Tan

            People

              Assignee: Carlo Curino (curino)
              Reporter: Christopher Douglas (cdouglas)
              Votes: 0
              Watchers: 30

                Time Tracking

                  Original Estimate: 1m
                  Remaining Estimate: 1m
                  Time Spent: Not Specified