[MAPREDUCE-548] Global scheduling in the Fair Scheduler - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 0.21.0
Component/s: None
Labels: None

Description

The current schedulers in Hadoop all examine a single job on every heartbeat when deciding which tasks to assign, selecting that job by FIFO order or fair sharing. This approach has inherent limitations. For example, if the job at the front of the queue is small (e.g. 10 maps in a cluster of 100 nodes), then on average it will launch only one local map in the first 10 heartbeats while it is at the head of the queue. This leads to very poor locality for small jobs. Instead, we need a more "global" view of scheduling that can look at multiple jobs. To resolve the locality problem, we will use the following algorithm:

- If the job at the head of the queue has no node-local task to launch, skip it and look through other jobs.
- If a job has waited at least T1 seconds while being skipped, also allow it to launch rack-local tasks.
- If a job has waited at least T2 > T1 seconds, also allow it to launch off-rack tasks.

This algorithm improves locality while bounding the delay that any job experiences in launching a task.
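
The three rules above can be sketched as follows. This is a minimal illustration, not the code in the attached patches: the names `Job`, `allowed_level`, `task_level` and `assign_task` are hypothetical, each pending task is assumed to have its input on a single node, and resetting a job's wait clock after any launch is an assumption the description does not spell out.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Locality(IntEnum):
    NODE_LOCAL = 0
    RACK_LOCAL = 1
    OFF_RACK = 2


@dataclass
class Job:
    name: str
    # Hypothetical model: one entry per pending task, naming the node that holds its input.
    pending: List[str] = field(default_factory=list)
    # Time at which the job was first skipped, or None if it is not currently waiting.
    skipped_since: Optional[float] = None


def allowed_level(job, now, t1, t2):
    """Worst locality level the job may use, based on how long it has been skipped."""
    if job.skipped_since is None:
        return Locality.NODE_LOCAL
    waited = now - job.skipped_since
    if waited >= t2:
        return Locality.OFF_RACK
    if waited >= t1:
        return Locality.RACK_LOCAL
    return Locality.NODE_LOCAL


def task_level(task_node, node, rack_of):
    """Locality of running a task whose input is on task_node on the heartbeating node."""
    if task_node == node:
        return Locality.NODE_LOCAL
    if rack_of[task_node] == rack_of[node]:
        return Locality.RACK_LOCAL
    return Locality.OFF_RACK


def assign_task(jobs, node, rack_of, now, t1, t2):
    """Pick one task for a heartbeat from `node`; `jobs` is already in FIFO or fair-share order."""
    for job in jobs:
        best = min(job.pending, key=lambda t: task_level(t, node, rack_of), default=None)
        if best is None:
            continue  # nothing left to launch in this job
        if task_level(best, node, rack_of) <= allowed_level(job, now, t1, t2):
            job.pending.remove(best)
            job.skipped_since = None  # assumption: any launch resets the wait clock
            return job.name, best
        if job.skipped_since is None:
            job.skipped_since = now  # the job is skipped; start timing its wait
    return None
```

Because a job that has been skipped for T2 seconds may launch anywhere, no job in this sketch waits longer than T2 once a heartbeat with a free slot arrives.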

It turns out that whether waiting is useful depends on how many tasks are left in the job (which determines the probability of getting a heartbeat from a node with a local task) and on whether the job is CPU-bound or IO-bound. Thus there may be logic for removing the wait on the last few tasks in the job.
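
To make the dependence on remaining tasks concrete, here is a rough back-of-the-envelope sketch. It assumes that each remaining task's input is placed independently and uniformly at random on `replicas` nodes and that heartbeats arrive uniformly from all nodes; neither assumption comes from the issue, and the function names are hypothetical.

```python
def p_local_heartbeat(tasks_left, num_nodes, replicas=1):
    """Probability that a single heartbeat comes from a node holding input
    for at least one of the job's remaining tasks (under the stated assumptions)."""
    # Chance that the heartbeating node holds no replica of one given task.
    p_miss_one = 1 - replicas / num_nodes
    # Tasks are assumed independent, so multiply the miss probabilities.
    return 1 - p_miss_one ** tasks_left


def expected_heartbeats_until_local(tasks_left, num_nodes, replicas=1):
    """Mean number of heartbeats before a local one arrives (geometric distribution)."""
    return 1 / p_local_heartbeat(tasks_left, num_nodes, replicas)


# 10 maps on 100 nodes: roughly a 10% chance per heartbeat, consistent with
# "one local map in the first 10 heartbeats" in the description.
print(p_local_heartbeat(10, 100))
# With a single task left, the expected wait grows to about 100 heartbeats.
print(expected_heartbeats_until_local(1, 100))
```

Under this model the chance of a local heartbeat shrinks as the job drains, which is why waiting on the last few tasks may cost more than it gains.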

As a related issue, once we allow global scheduling, we can launch multiple tasks per heartbeat, as in HADOOP-3136. The initial implementation of HADOOP-3136 adversely affected performance because it only launched multiple tasks from the same job, but with the wait rule above, we will only do this for jobs that are allowed to launch non-local tasks.
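
One possible reading of this rule is sketched below: a job still restricted to node-local tasks gets at most one task per heartbeat, and only if it has a local task on the heartbeating node, while a job that has waited long enough to launch non-local tasks may fill several free slots. The function `fill_slots` and its tuple layout are hypothetical and are not taken from HADOOP-3136 or the attached patches.

```python
def fill_slots(jobs, free_slots):
    """Decide which jobs receive the free slots of one heartbeat.

    jobs: list of (name, pending_tasks, non_local_ok, has_local_task) tuples,
          already in scheduling order.
    Returns a list with one job name per launched task.
    """
    launched = []
    for name, pending, non_local_ok, has_local_task in jobs:
        remaining = free_slots - len(launched)
        if remaining <= 0:
            break
        if non_local_ok:
            cap = remaining      # may take several slots in this heartbeat
        elif has_local_task:
            cap = 1              # still waiting for locality: one local task only
        else:
            cap = 0              # skipped on this heartbeat
        launched.extend([name] * min(pending, cap))
    return launched


print(fill_slots([("A", 5, False, True), ("B", 5, True, False)], 4))
```

In this sketch, job A takes one slot for its local task and job B, which is past its wait threshold, takes the other three.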

Attachments

| File | Date | Size | Author |
|---|---|---|---|
| mapreduce-548-v4.patch | 22/Jul/09 20:23 | 59 kB | Matei Alexandru Zaharia |
| mapreduce-548-v3.patch | 17/Jul/09 23:21 | 59 kB | Matei Alexandru Zaharia |
| mapreduce-548-v2.patch | 08/Jul/09 23:53 | 59 kB | Matei Alexandru Zaharia |
| mapreduce-548-v1.patch | 03/Jul/09 17:00 | 58 kB | Matei Alexandru Zaharia |
| mapreduce-548.patch | 03/Jul/09 04:23 | 56 kB | Matei Alexandru Zaharia |
| hadoop-4667-v2.patch | 21/Feb/09 08:25 | 55 kB | Matei Alexandru Zaharia |
| hadoop-4667-v1b.patch | 09/Feb/09 00:02 | 56 kB | Matei Alexandru Zaharia |
| hadoop-4667-v1.patch | 05/Feb/09 23:05 | 56 kB | Matei Alexandru Zaharia |
| HADOOP-4667_api.patch | 21/Jan/09 06:43 | 5 kB | Arun Murthy |
| fs-global-v0.patch | 08/Jan/09 07:03 | 67 kB | Matei Alexandru Zaharia |

Issue Links

is related to

MAPREDUCE-312 Port HADOOP-4667 to the default Map-Reduce scheduler (Resolved)

People

Assignee: Matei Alexandru Zaharia
Reporter: Matei Alexandru Zaharia
Votes: 0
Watchers: 17

Dates

Created: 16/Nov/08 00:09
Updated: 24/Aug/10 21:13
Resolved: 14/Aug/09 16:33