[MAPREDUCE-2205] FairScheduler should not re-schedule jobs that have just been preempted - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: None
Fix Version/s: None
Component/s: contrib/fair-share
Labels: None

Description

We have hit a problem with the preemption implementation in the FairScheduler where the following happens:

1. Job X runs short of its fair share or min share and causes N tasks to be preempted.
2. When the freed slots are then scheduled, tasks from some other job are actually scheduled.
3. After preemption_interval has passed, job X finds it is still under-scheduled and requests preemption again. Go to step 1.

This has caused widespread preemption of tasks, with the cluster going from high utilization to low utilization within a few minutes.

After some analysis of the logs, one of the biggest contributing factors seems to be how jobs are scheduled when a heartbeat advertises multiple free slots. Currently the scheduler goes over all the jobs/pools (in sorted order) until all the slots are exhausted. This leads to lower-priority jobs also getting scheduled, including jobs that may have just been preempted.
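The behavior described above can be illustrated with a minimal sketch. This is not the FairScheduler code: the `Job` class, the `assign` method, and the `runnableHere` limit are hypothetical names introduced only for illustration. The sketch assumes that a job can launch only a limited number of tasks on the heartbeating tracker, so a loop that keeps walking the sorted job list hands the remaining slots to jobs further down the order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeartbeatAssignmentSketch {
    /** Hypothetical stand-in for a schedulable job; not a FairScheduler class. */
    static final class Job {
        final String name;
        // Assumption: how many tasks this job can launch on the heartbeating tracker.
        final int runnableHere;

        Job(String name, int runnableHere) {
            this.name = name;
            this.runnableHere = runnableHere;
        }
    }

    /**
     * Walks the jobs in sorted order (most starved first) and keeps assigning
     * until the free slots on this heartbeat are exhausted.
     * Returns the name of the owning job for each launched task.
     */
    static List<String> assign(List<Job> sortedJobs, int freeSlots) {
        List<String> launched = new ArrayList<>();
        for (Job job : sortedJobs) {
            if (freeSlots == 0) {
                break;
            }
            int n = Math.min(freeSlots, job.runnableHere);
            for (int i = 0; i < n; i++) {
                launched.add(job.name);
            }
            freeSlots -= n;
        }
        return launched;
    }

    public static void main(String[] args) {
        // X is the starved job that triggered preemption; Y was just preempted.
        List<Job> jobs = Arrays.asList(new Job("X", 1), new Job("Y", 10));
        System.out.println(assign(jobs, 4)); // prints [X, Y, Y, Y]
    }
}
```

In this toy run, job X receives only one of the four freed slots and the just-preempted job Y takes the rest, so X would still be under its share at the next preemption check.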

Issue Links

relates to

MAPREDUCE-1204 Fair Scheduler preemption may preempt tasks running in slots unusable by the preempting job (Open)

People

Assignee: Scott Chen

Reporter: Joydeep Sen Sarma

Votes: 0

Watchers: 8

Dates

Created: 30/Nov/10 02:18

Updated: 16/Dec/10 19:51

Resolved: 16/Dec/10 19:51