[HADOOP-307] Many small jobs benchmark for MapReduce - ASF JIRA

Details

Type: Task
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.5.0
Component/s: None
Labels: None

Description

A benchmark that runs many small MapReduce tasks in sequence. A single MapReduce implementation is used and invoked multiple times, with each run taking the output of the previous run as its input. The input to the first map is read with TextInputFormat (a text file of a few hundred KB). Input records are passed to the output without much processing. The idea is to benchmark the time taken to initialize the Mapper and Reducer. An initial prototype on a single machine, running 20 MR tasks in sequence, took ~47 seconds per task. Suggestions on what else to include in the benchmark are welcome.
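The chaining described above can be sketched in plain Python. This is only an illustration of the harness structure under stated assumptions, not the code in the attached patch and not the Hadoop API: `make_text_records`, `identity_map`, `identity_reduce`, `run_job`, and `run_chain` are hypothetical names, and the job runs in-process, so it does not reproduce the Mapper/Reducer initialization cost that the real benchmark is meant to measure. Records are keyed by byte offset, mirroring how TextInputFormat presents lines.

```python
import time
from collections import defaultdict


def make_text_records(text):
    """Split text into (byte_offset, line) records, similar to TextInputFormat."""
    records = []
    offset = 0
    for line in text.splitlines(keepends=True):
        records.append((offset, line.rstrip("\n")))
        offset += len(line.encode("utf-8"))
    return records


def identity_map(key, value):
    # Pass the record through unchanged ("without much processing").
    yield key, value


def identity_reduce(key, values):
    # Emit every value for the key unchanged.
    for value in values:
        yield key, value


def run_job(records):
    """One simulated pass: map, group by key, then reduce in key order."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in identity_map(key, value):
            grouped[out_key].append(out_value)
    output = []
    for key in sorted(grouped):
        output.extend(identity_reduce(key, grouped[key]))
    return output


def run_chain(records, num_jobs):
    """Run num_jobs passes in sequence, feeding each output to the next run.

    Returns the final output and the wall-clock time of each pass.
    """
    timings = []
    current = list(records)
    for _ in range(num_jobs):
        start = time.perf_counter()
        current = run_job(current)
        timings.append(time.perf_counter() - start)
    return current, timings


if __name__ == "__main__":
    sample = make_text_records("first line\nsecond line\nthird line\n")
    final, timings = run_chain(sample, num_jobs=20)
    print(f"{len(timings)} runs, mean {sum(timings) / len(timings):.6f} s per run")
```

Because the map and reduce steps are identities, the final output equals the initial input, so any per-run time measured on a real cluster is almost entirely framework overhead rather than user code.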

Attachments

patch.txt | 06/Aug/06 09:19 | 5 kB | Sanjay Dahiya
patch.txt | 17/Jul/06 18:06 | 21 kB | Sanjay Dahiya
patch.txt | 12/Jul/06 19:43 | 21 kB | Sanjay Dahiya

Issue Links

incorporates: HADOOP-434 Use Hadoop scripts to run smallJobsBenchmark to avoid classpath issues. (Closed)

is blocked by: HADOOP-460 Small jobs benchmark fails with current Hadoop due to UTF8 -> Text ClassCastException (Closed)

People

Assignee: Sanjay Dahiya

Reporter: Sanjay Dahiya

Votes: 0

Watchers: 0

Dates

Created: 18/Jun/06 20:14

Updated: 08/Jul/09 16:51

Resolved: 18/Jul/06 11:03