  1. Hadoop Map/Reduce
  2. MAPREDUCE-2205

FairScheduler should not re-schedule jobs that have just been preempted


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Component/s: contrib/fair-share

    Description

      We have hit a problem with the preemption implementation in the FairScheduler where the following happens:

      1. Job X falls below its fair share or min share and causes N tasks to be preempted.
      2. When the freed slots are then scheduled, tasks from some other job are actually scheduled.
      3. After preemption_interval has passed, job X finds it is still underscheduled and requests preemption again. Go to 1.

      This has caused widespread preemption of tasks, with the cluster going from high utilization to low utilization in a few minutes.
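      The loop described above can be illustrated with a toy model. This is only a sketch, not FairScheduler code: the function name simulate_preemption_loop and the parameter slots_won_per_round (how many of the freed slots job X actually receives each round) are hypothetical, and it assumes job X preempts exactly as many tasks as it is currently short.

```python
def simulate_preemption_loop(deficit, slots_won_per_round, max_rounds=10):
    """Toy model of the preemption loop; returns (rounds, tasks_killed).

    deficit: number of slots job X is below its share (N in the description).
    slots_won_per_round: hypothetical count of freed slots that job X
        actually receives in each round; the rest go to other jobs.
    """
    killed = 0
    rounds = 0
    while deficit > 0 and rounds < max_rounds:
        freed = deficit                         # step 1: preempt as many tasks as X is short
        killed += freed
        won = min(freed, slots_won_per_round)   # step 2: other jobs take the remaining slots
        deficit -= won                          # step 3: if still short, the loop repeats
        rounds += 1
    return rounds, killed


if __name__ == "__main__":
    # Job X gets every freed slot: one round, no extra kills.
    print(simulate_preemption_loop(10, 10))
    # Job X gets only 2 of the freed slots per round: repeated preemption.
    print(simulate_preemption_loop(10, 2))
```

      Under these assumptions, the fewer freed slots job X actually receives, the more rounds of preemption are needed and the more tasks are killed in total for the same initial shortfall.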

      After some analysis of the logs, one of the biggest contributing factors seems to be how jobs are scheduled when a heartbeat advertises multiple slots. Currently the scheduler goes over all the jobs/pools (in sorted order) until all the slots are exhausted. This leads to lower-priority jobs (which may have just been preempted) also getting scheduled.
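      A minimal sketch of the multi-slot heartbeat behaviour follows. It is a simplified model rather than the actual FairScheduler code: the Job fields, the assign_slots function, and the skip_just_preempted flag are all hypothetical, and it assumes each job receives at most one slot per pass over the sorted list.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deficit: int                  # slots below fair share; larger means more starved
    pending: int                  # tasks waiting for a slot
    just_preempted: bool = False  # hypothetical marker: job recently lost tasks to preemption


def assign_slots(jobs, free_slots, skip_just_preempted=False):
    """Hand out the free slots of one heartbeat, at most one per job, in sorted order."""
    order = sorted(jobs, key=lambda j: j.deficit, reverse=True)
    assigned = []
    for job in order:
        if free_slots == 0:
            break
        if job.pending == 0:
            continue
        if skip_just_preempted and job.just_preempted:
            continue  # do not hand a slot back to a job that was just preempted
        job.pending -= 1
        free_slots -= 1
        assigned.append(job.name)
    return assigned


def make_jobs():
    # X is starved; Y was just preempted to make room for X.
    return [Job("X", deficit=3, pending=5),
            Job("Y", deficit=0, pending=5, just_preempted=True)]


if __name__ == "__main__":
    print(assign_slots(make_jobs(), free_slots=2))
    print(assign_slots(make_jobs(), free_slots=2, skip_just_preempted=True))
```

      In this model, a two-slot heartbeat gives the second slot straight back to the just-preempted job Y unless such jobs are skipped, which is the behaviour the issue title argues against.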


            People

              schen Scott Chen
              jsensarma Joydeep Sen Sarma
              Votes: 0
              Watchers: 8
