Hadoop Map/Reduce: MAPREDUCE-1261

Enhance mumak to implement a 'stress-test' for the JobTracker

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: contrib/mumak
    • Labels:
      None

      Description

      I propose we enhance mumak to implement a proper 'stress-test' tool for the JobTracker. The idea is to give mumak a mode in which it uses the real JobTracker (and Scheduler, of course) together with mumak's SimulatedTaskTracker to replay real workloads from production job-history traces. This requires changes that let the SimulatedTaskTrackers run independently (one thread per SimulatedTT) and be distributed across machines.

      We could then simulate very large clusters and workloads on a handful of machines (say, ~50 machines to simulate a workload that originally ran on a 4000-node cluster). We could also use this to stress the JobTracker with synthetic workloads.
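To make the proposed layout concrete, here is a minimal sketch of the thread-per-SimulatedTT idea and of the arithmetic behind the 50-machine figure. It is an illustration only: the class and method names are hypothetical and are not part of mumak or Hadoop, and a counter increment stands in for the heartbeat RPC that a real SimulatedTaskTracker would send to the JobTracker.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from mumak or Hadoop.
class ThreadPerTrackerSketch {

    // Simulated trackers each host has to run, rounded up.
    static int trackersPerHost(int simulatedNodes, int hosts) {
        return (simulatedNodes + hosts - 1) / hosts;
    }

    // Starts one thread per simulated tracker on this host. Each thread
    // "sends" a fixed number of heartbeats; the counter increment is a
    // stand-in for an RPC to the real JobTracker.
    static int runHost(int trackers, int heartbeatsPerTracker) {
        AtomicInteger received = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(trackers);
        for (int i = 0; i < trackers; i++) {
            Thread t = new Thread(() -> {
                for (int h = 0; h < heartbeatsPerTracker; h++) {
                    received.incrementAndGet();
                }
                done.countDown();
            }, "simulated-tt-" + i);
            t.start();
        }
        try {
            done.await(); // wait until every tracker thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received.get();
    }
}
```

With the numbers from the description, `trackersPerHost(4000, 50)` gives 80 simulated trackers, and therefore 80 threads, per machine.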

      Thoughts?

        Activity

        Matei Zaharia added a comment -

        Is a TaskTracker that's not doing any work really so resource-intensive that you need 50 machines to simulate a 4000-node cluster? It might be nice to figure out a more efficient way to do this than having one thread per TaskTracker so that it's possible to run these stress-tests without needing a 50-node cluster. For example, we might be able to run many TaskTrackers in one thread using asynchronous IO, if the RPC framework supports that.
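One way to picture the alternative raised in this comment is sketched below. It does not use asynchronous IO, since the comment leaves open whether the RPC framework supports it; instead it swaps in a single-threaded `ScheduledExecutorService` from the JDK to multiplex the heartbeats of many simulated trackers onto one thread. All names are hypothetical, and the counter increment again stands in for a heartbeat RPC.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not mumak or Hadoop code.
class SharedThreadTrackersSketch {

    // Drives `trackers` simulated trackers from ONE scheduler thread.
    // Each tracker sends `heartbeatsPerTracker` heartbeats, spaced
    // `intervalMillis` apart. Returns the number of heartbeats sent.
    static int run(int trackers, int heartbeatsPerTracker, long intervalMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger sent = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(trackers * heartbeatsPerTracker);
        try {
            for (int i = 0; i < trackers; i++) {
                for (int h = 0; h < heartbeatsPerTracker; h++) {
                    // Placeholder for a non-blocking heartbeat to the JobTracker.
                    timer.schedule(() -> {
                        sent.incrementAndGet();
                        done.countDown();
                    }, h * intervalMillis, TimeUnit.MILLISECONDS);
                }
            }
            done.await(); // block until every scheduled heartbeat has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        return sent.get();
    }
}
```

This only helps if each heartbeat is cheap and non-blocking. A blocking RPC call inside the scheduled task would stall every other simulated tracker sharing the thread, which is why the comment makes the idea conditional on asynchronous support in the RPC layer.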


          People

          • Assignee: Unassigned
          • Reporter: Arun C Murthy
          • Votes: 1
          • Watchers: 7
