  1. Hadoop Common
  2. HADOOP-307

Many small jobs benchmark for MapReduce

Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: None
    • Labels: None

    Description

      A benchmark that runs many small MapReduce tasks in sequence. A single MapReduce implementation is used and invoked multiple times, with each run taking the output of the previous run as its input. The input to the first map is a text file of a few hundred KB, read with TextInputFormat. Input records are passed through to the output with little processing. The idea is to benchmark the time taken to initialize the Mapper and Reducer. An initial prototype on a single machine, running 20 MR tasks in sequence, took ~47 seconds per task. Suggestions on what else to include in the benchmark are welcome.
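
      The chaining and timing loop described above could be sketched roughly as follows. This is not the attached patch and does not use any Hadoop classes: `ManySmallJobsSketch`, `runIdentityPass`, and `runChain` are hypothetical names, and a plain file copy stands in for the pass-through map and reduce. It only illustrates feeding each run's output into the next run and recording the wall-clock time per run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ManySmallJobsSketch {

    // Hypothetical stand-in for one small MapReduce run: every input record
    // is written to the output unchanged (no Hadoop API is involved here).
    static void runIdentityPass(Path input, Path output) throws IOException {
        List<String> records = Files.readAllLines(input);
        Files.write(output, records);
    }

    // Executes numRuns passes in sequence. The output of run i becomes the
    // input of run i + 1. Returns the elapsed time of each run in milliseconds.
    static List<Long> runChain(Path firstInput, Path workDir, int numRuns)
            throws IOException {
        List<Long> timingsMs = new ArrayList<>();
        Path input = firstInput;
        for (int i = 0; i < numRuns; i++) {
            Path output = workDir.resolve("run-" + i + ".txt");
            long start = System.nanoTime();
            runIdentityPass(input, output);
            timingsMs.add((System.nanoTime() - start) / 1_000_000);
            input = output; // chain: the next run reads what this run wrote
        }
        return timingsMs;
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("many-small-jobs");
        Path firstInput = workDir.resolve("input.txt");

        // Build a small text input; the record count is an arbitrary choice.
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            lines.add("record " + i);
        }
        Files.write(firstInput, lines);

        List<Long> timings = runChain(firstInput, workDir, 20);
        long totalMs = 0;
        for (long t : timings) {
            totalMs += t;
        }
        System.out.println("runs=" + timings.size()
                + " averageMs=" + (totalMs / (double) timings.size()));
    }
}
```

      In a real benchmark the body of `runIdentityPass` would submit an actual MapReduce job, so the per-run time would be dominated by job setup and Mapper/Reducer initialization rather than by file I/O.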

      Attachments

        1. patch.txt
          5 kB
          Sanjay Dahiya
        2. patch.txt
          21 kB
          Sanjay Dahiya
        3. patch.txt
          21 kB
          Sanjay Dahiya

            People

              Assignee: Sanjay Dahiya (sanjay.dahiya)
              Reporter: Sanjay Dahiya (sanjay.dahiya)
              Votes: 0
              Watchers: 0

              Dates

                Created:
                Updated:
                Resolved: