Details
- Type: Task
- Status: Closed
- Priority: Minor
- Resolution: Fixed
Description
A benchmark that runs many small MapReduce tasks in sequence. A single MapReduce implementation is used; it is invoked multiple times, with each run taking the output of the previous run as its input. The input to the first Map is read with TextInputFormat (a text file of a few hundred KB). Input records are passed to the output without much processing. The idea is to benchmark the time taken to initialize the Mapper and Reducer. An initial prototype on a single machine, with 20 MR tasks in sequence, took ~47 seconds per task. Looking for suggestions on what else can be included in the benchmark.
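A rough sketch of the chaining and per-task timing described above, in plain Java with no Hadoop dependency. It only models the control flow: `identityJob` is a hypothetical in-memory stand-in for a pass-through MapReduce run, so the timings it prints say nothing about real Mapper and Reducer initialization. The names `SmallJobsChainSketch`, `ChainResult`, `runChain` and `identityJob` are illustrative assumptions, not part of the actual benchmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

public class SmallJobsChainSketch {

    /** Final output of the chain plus the wall-clock time of each run, in nanoseconds. */
    static final class ChainResult {
        final List<String> output;
        final long[] elapsedNanos;

        ChainResult(List<String> output, long[] elapsedNanos) {
            this.output = output;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Hypothetical stand-in for one pass-through run: copies every record unchanged. */
    static List<String> identityJob(List<String> records) {
        return new ArrayList<>(records);
    }

    /** Runs the same job numJobs times, feeding each run's output into the next run. */
    static ChainResult runChain(List<String> input, int numJobs,
                                UnaryOperator<List<String>> job) {
        long[] elapsed = new long[numJobs];
        List<String> current = input;
        for (int i = 0; i < numJobs; i++) {
            long start = System.nanoTime();
            current = job.apply(current);           // output becomes the next input
            elapsed[i] = System.nanoTime() - start; // time for this run only
        }
        return new ChainResult(current, elapsed);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("record-1", "record-2", "record-3");
        ChainResult result = runChain(input, 20, SmallJobsChainSketch::identityJob);

        long total = 0;
        for (long t : result.elapsedNanos) {
            total += t;
        }
        System.out.println("runs: " + result.elapsedNanos.length);
        System.out.println("records out: " + result.output.size());
        System.out.println("mean ns per run: " + total / result.elapsedNanos.length);
    }
}
```

In a real harness the `job` argument would submit an actual MapReduce run and block until it completes, with each run's output path passed as the next run's input path.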
Attachments
Issue Links
- incorporates: HADOOP-434 Use Hadoop scripts to run smallJobsBenchmark to avoid classpath issues. (Closed)
- is blocked by: HADOOP-460 Small jobs benchmark fails with current Hadoop due to UTF8 -> Text ClassCastException (Closed)