  1. Hadoop Map/Reduce
  2. MAPREDUCE-751

Rumen: a tool to extract job characterization data from job tracker logs



    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: tools/rumen
    • Labels: None
    • Hadoop Flags: Reviewed
    • Tags: rumen, mumakil, job tracker logs


      We propose a new map/reduce component, rumen, which can be used to process job history logs to produce any or all of the following:

      • Retrospective statistics on how long it would have taken to
        launch a job into a given percentage of the mapper slots in the
        log's cluster, given the load over the period covered by the log
      • Statistics on runtimes, shuffle times, and similar measures for
        the tasks and jobs covered by the log
      • Files describing detailed job traces and the network topology,
        as inferred from the host locations and rack IDs that appear in
        the job tracker log. Rumen also includes readers for these files,
        so that other tools can retrieve job and detailed task
        information.
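
      As a rough illustration of the topology inference mentioned in the last item, the sketch below groups host names by rack. It is not rumen's code: the class name `TopologySketch`, the method `inferTopology`, and the `/rack/host` location format are all assumptions made for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: hypothetical names, not rumen's actual classes or file formats.
class TopologySketch {

    // Groups host names by rack, given locations of the ASSUMED form "/rack/host".
    static Map<String, SortedSet<String>> inferTopology(List<String> locations) {
        Map<String, SortedSet<String>> racks = new TreeMap<>();
        for (String location : locations) {
            int split = location.lastIndexOf('/');
            // Skip entries with no rack component or no host name.
            if (split <= 0 || split == location.length() - 1) {
                continue;
            }
            String rack = location.substring(0, split);
            String host = location.substring(split + 1);
            // A host seen several times in the log is recorded once per rack.
            racks.computeIfAbsent(rack, r -> new TreeSet<>()).add(host);
        }
        return racks;
    }

    public static void main(String[] args) {
        // Made-up locations standing in for values pulled from a log.
        List<String> seen = List.of(
                "/rack-a/node1", "/rack-b/node3", "/rack-a/node2", "/rack-a/node1");
        System.out.println(inferTopology(seen));
    }
}
```

      Sorted maps and sets are used here only so that the result is deterministic; a real trace format would likely carry more than host names.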

      These other tools include a more advanced version of gridmix as well as mumak; see the blocked issues.


        1. mapreduce-751-20090826.patch
          1.44 MB
          Hong Tang
        2. mapreduce-751-20090826.patch
          1.44 MB
          Christopher Douglas
        3. mapreduce-751--2009-07-23.patch
          1012 kB
          Dick King
        4. 2009-08-26--1513-patch.patch
          1.44 MB
          Dick King
        5. 2009-08-19--1030.patch
          1008 kB
          Dick King

              Assignee: Dick King (dking)
              Reporter: Dick King (dking)