[MAPREDUCE-1220] Implement an in-cluster LocalJobRunner - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 0.23.0
Component/s: client, jobtracker
Labels:
None

Release Note:
An efficient implementation of small jobs that runs all tasks in the same JVM, thereby achieving lower latency.

Description

Currently, very small map-reduce jobs suffer from high latency due to overheads in Hadoop Map-Reduce such as scheduling, JVM startup, etc. We've periodically tried to optimize all parts of the framework to achieve lower latencies.

I'd like to turn the problem around a little bit. I propose we allow very small jobs to run as a single-task job with multiple maps and reduces, i.e. similar to our current implementation of the LocalJobRunner. Thus, under certain conditions (maybe user-set configuration, or if the input data is small, i.e. less than a DFS block size) we could launch a special task which will run all maps serially, followed by the reduces. This would really help small jobs achieve significantly lower latencies, thanks to less scheduling overhead, fewer JVM startups, no shuffle over the network, etc.
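As a rough illustration of the idea above, here is a minimal, self-contained sketch in plain Java. It is not Hadoop code and not taken from any of the attached patches: the class `SmallJobSketch`, the methods `runsAsSingleTask` and `runSerially`, and the word-count workload are all hypothetical names chosen for illustration. The eligibility rule (user opt-in, or input smaller than one DFS block) is just one reading of the "certain conditions" mentioned in the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; none of these names exist in Hadoop.
class SmallJobSketch {

    // Assumed eligibility rule: the user opted in via configuration,
    // or the input is smaller than a single DFS block.
    static boolean runsAsSingleTask(boolean userOptIn, long inputBytes, long dfsBlockBytes) {
        return userOptIn || inputBytes < dfsBlockBytes;
    }

    // Runs all maps serially in this JVM, groups map output in memory
    // (standing in for the network shuffle), then runs the reduces serially.
    // The workload is a toy word count over whitespace-separated splits.
    static Map<String, Integer> runSerially(List<String> splits) {
        Map<String, List<Integer>> grouped = new TreeMap<>();

        // Map phase: one split at a time, emitting (word, 1).
        for (String split : splits) {
            for (String word : split.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // blank split produces no output
                }
                grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }

        // Reduce phase: one key at a time, summing the emitted values.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed block size for the demo
        List<String> splits = Arrays.asList("a b a", "b c");
        if (runsAsSingleTask(false, 11L, blockSize)) {
            System.out.println(runSerially(splits));
        }
    }
}
```

The point of the sketch is only that the per-task costs (scheduling, JVM startup, network shuffle) are paid once or not at all; a real implementation would also have to handle task progress reporting, failures, and memory limits, which are omitted here.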

This would be a huge benefit, especially on large clusters, to small Hive/Pig queries.

Thoughts?

Attachments

Name | Date | Size | Author
MAPREDUCE-1220_yhadoop20.patch | 11/Feb/10 03:07 | 45 kB | Arun Murthy
MR-1220.v1.trunk-hadoop-common.Progress-dumper.patch.txt | 08/Mar/11 16:05 | 2 kB | Greg Roelofs
MR-1220.v10e-v11c-v12b.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:18 | 73 kB | Greg Roelofs
MR-1220.v13.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:18 | 5 kB | Greg Roelofs
MR-1220.v14b.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:20 | 4 kB | Greg Roelofs
MR-1220.v15.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:23 | 13 kB | Greg Roelofs
MR-1220.v1b.sshot-02-jobdetails.jsp.png | 08/Mar/11 22:59 | 80 kB | Greg Roelofs
MR-1220.v1b.sshot-03-jobdetailshistory.jsp.png | 08/Mar/11 23:06 | 159 kB | Greg Roelofs
MR-1220.v2.trunk-hadoop-mapreduce.patch.txt | 09/Sep/10 01:51 | 50 kB | Greg Roelofs
MR-1220.v2.trunk-hadoop-mapreduce.patch.txt | 08/Sep/10 22:04 | 37 kB | Greg Roelofs
MR-1220.v2b.sshot-01-jobtracker.jsp.png | 08/Mar/11 22:57 | 218 kB | Greg Roelofs
MR-1220.v6.ytrunk-hadoop-mapreduce.patch.txt | 08/Mar/11 16:09 | 180 kB | Greg Roelofs
MR-1220.v7.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:10 | 0.5 kB | Greg Roelofs
MR-1220.v8b.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:15 | 12 kB | Greg Roelofs
MR-1220.v9c.ytrunk-hadoop-mapreduce.delta.patch.txt | 08/Mar/11 16:16 | 29 kB | Greg Roelofs

Issue Links

is related to

MAPREDUCE-2405 MR-279: Implement uber-AppMaster (in-cluster LocalJobRunner for MRv2)

Closed

People

Assignee: Greg Roelofs

Reporter: Arun Murthy

Votes: 3

Watchers: 35

Dates

Created: 18/Nov/09 22:44

Updated: 15/Nov/11 00:48

Resolved: 18/Oct/11 06:53