Hadoop Map/Reduce
  MAPREDUCE-4305

Implement delay scheduling in capacity scheduler for improving data locality

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      With the Capacity Scheduler, data-local tasks are only about 40-50%, which is not good. In my tests on a 70-node cluster I consistently get data locality of around 40-50% on a free cluster.

      I think we need to implement something like delay scheduling in the Capacity Scheduler to improve data locality:
      http://radlab.cs.berkeley.edu/publication/308

      After implementing delay scheduling on Hadoop 0.22, I get 100% data locality on a free cluster and around 90% data locality on a busy cluster.

      Thanks,
      Mayank
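The delay-scheduling idea from the paper linked above can be sketched as a small toy in Java. Everything here (class and method names, the skip thresholds) is illustrative and assumed, not the actual patch code:

```java
// Illustrative sketch of delay scheduling (all names hypothetical):
// a job passes up scheduling opportunities that are not data-local,
// and only relaxes its locality requirement after enough skips.
public class DelayScheduler {
    public enum LocalityLevel { NODE, RACK, ANY }

    private final int maxSkips; // e.g. proportional to cluster size
    private int skips = 0;

    public DelayScheduler(int maxSkips) { this.maxSkips = maxSkips; }

    /** Strictest locality level the job may currently schedule at. */
    public LocalityLevel allowedLevel() {
        if (skips < maxSkips) return LocalityLevel.NODE;
        if (skips < 2 * maxSkips) return LocalityLevel.RACK;
        return LocalityLevel.ANY;
    }

    /** The job declined a slot that was not local enough. */
    public void recordSkip() { skips++; }

    /** The job launched a task at its allowed level; start over. */
    public void reset() { skips = 0; }
}
```

In a real scheduler the skip threshold would be derived from cluster size, which is why waiting a short while tends to recover locality even on busy clusters.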

      1. MAPREDUCE-4305
        22 kB
        Mayank Bansal
      2. MAPREDUCE-4305-1.patch
        22 kB
        Mayank Bansal
      3. PATCH-MAPREDUCE-4305-MR1.patch
        43 kB
        Mayank Bansal
      4. PATCH-MAPREDUCE-4305-MR1-1.patch
        43 kB
        Mayank Bansal
      5. PATCH-MAPREDUCE-4305-MR1-2.patch
        39 kB
        Mayank Bansal
      6. PATCH-MAPREDUCE-4305-MR1-3.patch
        40 kB
        Mayank Bansal
      7. PATCH-MAPREDUCE-4305-MR1-6.patch
        35 kB
        Mayank Bansal
      8. PATCH-MAPREDUCE-4305-MR1-7.patch
        35 kB
        Mayank Bansal

        Issue Links

          Activity

          Karthik Kambatla added a comment -

          Arun C Murthy, can you take a look at this when you get a chance? It would be a nice addition to have.

          Karthik Kambatla added a comment -

          Mayank's latest patch looks good. Arun C Murthy, do you have any further comments on this?

          Karthik Kambatla added a comment -

          Thanks Mayank. +1.

          Mayank Bansal added a comment -

          Thanks Karthik for the review.

          Updated your comments.

          Thanks,
          Mayank

          Karthik Kambatla added a comment -

          Thanks Mayank. +1 on the code part.

           Sorry for missing these readability nits in my last review. Feel free to ignore some or all of them.

           1. Should we rename NumberBasedDelayScheduling to SkipsBasedDelayScheduling to make it easier to understand? (We might have to change other method names accordingly.)
           2. Rename calcNumberBasedDelayScheduling to resetNumberBasedDelayScheduling.
           3. Within calcNumberBasedDelayScheduling, use the ternary operator instead of if-else.
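Nits 2 and 3 together might look like the following sketch. Only the method name comes from the comment; the field name and the reset logic are guesses for illustration:

```java
// Hypothetical sketch of nits 2 and 3: the renamed reset method
// written with a ternary instead of if-else. The field and the
// fallback value of 1 are illustrative, not from the patch.
public class SkipThreshold {
    private long skipScheduleTimes;

    /** Reset the skips-based delay threshold from the active node count. */
    public void resetNumberBasedDelayScheduling(long numActiveNodes) {
        skipScheduleTimes = (numActiveNodes > 0) ? numActiveNodes : 1;
    }

    public long getSkipScheduleTimes() { return skipScheduleTimes; }
}
```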
          Mayank Bansal added a comment -

          Thanks Arun and Karthik for your valuable comments.

          I am updating the patch with all your comments.

          Please take a look.

          Thanks,
          Mayank

          Karthik Kambatla added a comment -

          Thanks Mayank. From a logic point of view, the code looks good.

          Have a few nits, mostly formatting:

           1. The patch has a few unrelated diffs - mostly whitespace and formatting changes.
          2. Definition of SKIP_SCHEDULING_TIMES seems way over 80 chars - can we wrap it around?
          3. Formatting seems off at
            long numActiveNodesinCluster = scheduler.taskTrackerManager
                  .getClusterStatus().getTaskTrackers()
                  - scheduler.taskTrackerManager.getClusterStatus()
                  .getGraylistedTrackers();
            
          4. Switch statement in getAllowedLocalityLevel() has a few statements over 80 chars
          5. LocalityStage javadoc refers to FairScheduler
          6. In TestCapacityScheduler, formatting in method parameters of obtain*MapTask: no space after comma
          Mayank Bansal added a comment -

          Updating the patch with default value.

          Thanks,
          Mayank

          Mayank Bansal added a comment -

          Hi,

          Thanks Arun for your comments.
           I had an offline discussion with Arun.

           The reason behind the timeouts I added was to save jobs from starving when priorities are involved; however, we need more work for that case.
           So I am refactoring my patch into two patches.
           This patch is mostly YARN-80 for Hadoop-1, with a test framework for testing scenarios like node-local, rack-local, etc.
           I will file a JIRA for the priority and timeout work and update the patch there.

          Thanks,
          Mayank

          Arun C Murthy added a comment -

          Mayank - the time-based configs are a bad idea (I've said the same about the FairScheduler long ago) - they don't consider cluster size, job length, current job progress, etc.

          I promise you that porting YARN-80 is sufficient and will get you required locality improvements. Please, let us not add more configs if possible. Thanks.

          Mayank Bansal added a comment -

          Fixing small bug

          Thanks,
          Mayank

          Karthik Kambatla added a comment -

          Didn't mean to undermine the patch's scope; I completely agree it is a combination of YARN-80 and delay scheduling with timeouts from the FairScheduler. Personally, I like this approach better; maybe we can augment the one in YARN where applicable.

          Mayank Bansal added a comment -

          Hi Karthik,

          There is more to it than YARN-80; please follow up on my previous comments.

          It's a combination of YARN-80 and part of the Fair Scheduler's delay scheduling with timeouts.

          Thanks,
          Mayank

          Karthik Kambatla added a comment -

          Thanks Arun. My understanding is that Mayank's patch is mostly just backporting YARN-80 to MR1, along with other MR1 specific changes.

          Mayank Bansal added a comment -

          Hi Arun,

          In Hadoop-1 we have a scheduling-opportunities counter for each job, which gets incremented every time the job gets an opportunity to be scheduled.
          In YARN-80 we use the same counter to delay scheduling for node locality.

          In this patch I used the scheduling-opportunity counter as well as timeouts for staging jobs through the different scheduling categories.

          Initially a job is eligible to schedule only node-local tasks; once its scheduling opportunities exceed the number of nodes in the cluster, or the node-level timeout elapses (whichever comes first), it graduates to the next level, which is rack-local.

          The job then waits again until its scheduling opportunities exceed the number of nodes in the cluster, or the node-plus-rack timeout elapses (whichever comes first), and graduates to the next level, which is off-rack.

          Once the job is at the off-rack level, it is scheduled immediately based on the prior logic.

          This approach is a combination of YARN-80 and the Fair Scheduler's delay scheduling algorithm; it gives us the flexibility of staging jobs between different levels while reusing the scheduling-opportunity counter that was already there.

          Please review the approach and let me know if it needs improvement.

          Thanks,
          Mayank
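The graduation rule described in this comment (scheduling opportunities exceeding the cluster's node count, or a per-level timeout, whichever comes first) can be sketched as follows. All names and the exact threshold arithmetic here are illustrative assumptions, not the patch itself:

```java
// Illustrative sketch (hypothetical names): a job graduates from
// NODE to RACK to OFF_RACK when either its scheduling opportunities
// exceed a threshold based on the number of nodes in the cluster,
// or the per-level timeout elapses -- whichever comes first.
public class LocalityStager {
    public enum Stage { NODE, RACK, OFF_RACK }

    private final long numNodes;
    private final long nodeTimeoutMs;      // timeout for the node stage
    private final long nodeRackTimeoutMs;  // node + rack timeout
    private final long startMs;
    private long opportunities = 0;

    public LocalityStager(long numNodes, long nodeTimeoutMs,
                          long nodeRackTimeoutMs, long nowMs) {
        this.numNodes = numNodes;
        this.nodeTimeoutMs = nodeTimeoutMs;
        this.nodeRackTimeoutMs = nodeRackTimeoutMs;
        this.startMs = nowMs;
    }

    /** Called each time the job gets an opportunity to be scheduled. */
    public void recordOpportunity() { opportunities++; }

    /** Current stage, given the clock value nowMs. */
    public Stage stage(long nowMs) {
        long waited = nowMs - startMs;
        if (opportunities <= numNodes && waited < nodeTimeoutMs) {
            return Stage.NODE;
        }
        if (opportunities <= 2 * numNodes && waited < nodeRackTimeoutMs) {
            return Stage.RACK;
        }
        return Stage.OFF_RACK; // scheduled immediately at this stage
    }
}
```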

          Arun C Murthy added a comment -

          Karthik & Mayank - CS already has delay scheduling built in; one area for improvement is to backport something like YARN-80 to branch-1.

          Mayank Bansal added a comment -

          Thanks Karthik for looking at the patch. If you have comments, please share them and I will try to incorporate them ASAP.

          Thanks,
          Mayank

          Karthik Kambatla added a comment -

          Thanks Mayank. The overall approach in the patch seems correct. I have a few code-specific comments, but it might be better to review the final patch.

          Mayank Bansal added a comment -

          I am still working on adding more tests; I will update with the latest soon.

          Thanks,
          Mayank

          Mayank Bansal added a comment -

          Initial Patch for MR1

          Thanks,
          Mayank

          Mayank Bansal added a comment -

          Hi,

          Thanks Konst for your comments.

          I am working on MR-1 patch, will put it shortly.

          Thanks,
          Mayank

          Konstantin Shvachko added a comment -

          Task locality is important. Interesting that it is only necessary to hook Capacity Scheduler up to the logic that already existed in JobInProgress etc. I went over the general logic of the patch. It looks good. But I have several formatting and code organization comments.

          1. Append _PROPERTY to new config key constants, e.g. NODE_LOCALITY_DELAY_PROPERTY. Looks like other constants in CapacitySchedulerConf are like that.
           2. Wrap long lines.
          3. In CapacitySchedulerConf convert comments describing variables to a JavaDoc.
          4. In initializeDefaults() you should use capacity-scheduler not fairscheduler config variables. Also since you introduced constants for the keys, use them rather than the raw keys.
           5. JobInfo is confusing because there is already a class with that name. Call it something like JobLocality. I'd rather move it into JobQueuesManager, because the latter maintains the map of those.
          6. Correct indentations in CapacityTaskScheduler, particularly eliminate all tabs, should be spaces only.
          7. Add spaces between arguments, operators, and in some LOG messages.
          8. Add empty lines between new methods.
          9. updateLocalityWaitTimes() and updateLastMapLocalityLevel() should belong to JobQueuesManager, imo.
           10. JobQueuesManager.infos is a map keyed by JobInProgress. Would it be better to use JobID as the key?
          11. In TaskSchedulingMgr you need only one version of obtainNewTask to be abstract, the one with cachelevel parameter. The other one should not be abstract and just call the abstract obtainNewTask() with cachelevel set to any.
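Item 11 could be sketched as below; the parameter and return types are simplified stand-ins for illustration, not the real Hadoop signatures:

```java
// Illustrative sketch of item 11 (simplified stand-in types, not the
// real Hadoop signatures): only the cache-level-aware overload is
// abstract; the other concretely delegates with an "any" cache level.
abstract class TaskSchedulingMgr {
    static final int ANY_CACHE_LEVEL = Integer.MAX_VALUE;

    /** Subclasses implement only this version. */
    abstract String obtainNewTask(String taskTracker, int cacheLevel);

    /** Concrete convenience overload: no locality restriction. */
    String obtainNewTask(String taskTracker) {
        return obtainNewTask(taskTracker, ANY_CACHE_LEVEL);
    }
}
```

Keeping a single abstract method means each scheduler subclass has exactly one place to implement the locality-aware logic.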
          Mayank Bansal added a comment -

          Uploading the Updated patch

          Mayank Bansal added a comment -

          Initial patch for 22

          Mayank Bansal added a comment -

          @Amar

          Can you please share the data-locality numbers you are seeing? I am running TeraSort for my benchmarking.

          @Thomas

          2-5% tasks are off rack and rest are rack local.

          Thomas Graves added a comment -

          I'm curious. Is improving the data locality improving overall job performance/decreasing runtime? What is the other 50-60% - is it rack local or off rack?

          Amar Kamat added a comment -

          Mayank,
          I assume that you are using Hadoop 0.22. The numbers that we are seeing (on 0.20.x) are different from what you have reported. IIRC, the Hadoop 0.22 code is still the old Hadoop codebase (compared to 0.23/trunk) and should be similar to Hadoop 0.20. Can you re-run your experiments on 0.20.x (i.e. branch-1.x) and share your findings?


            People

            • Assignee:
              Mayank Bansal
              Reporter:
              Mayank Bansal
            • Votes:
              0
              Watchers:
              22

              Dates

              • Created:
                Updated:
