Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Vision:

      We want to build a simulator for large-scale Hadoop clusters, applications and workloads. It would be invaluable in furthering Hadoop by giving researchers and developers a tool to prototype features (e.g. pluggable block placement for HDFS, Map-Reduce schedulers) and to predict their behaviour and performance with a reasonable degree of confidence, thereby aiding rapid innovation.


      First Cut: Simulator for the Map-Reduce Scheduler

      The Map-Reduce Scheduler is a fertile area of interest, with at least four schedulers currently in existence, each with its own set of features: Default Scheduler, Capacity Scheduler, Fairshare Scheduler and Priority Scheduler.

      Each scheduler's scheduling decisions are driven by many factors, such as fairness, capacity guarantee, resource availability, data-locality etc.

      Given that, it is non-trivial to choose the right scheduler, or even the right set of scheduler features, for a given workload. Hence a simulator that can predict how well a particular scheduler works for a specific workload, by quickly iterating over schedulers and/or scheduler features, would be quite useful.

      So, the first cut is to implement a simulator for the Map-Reduce scheduler which takes as input a job trace derived from a production workload and a cluster definition, and simulates the execution of the jobs defined in the trace on this virtual cluster. As output, the detailed job execution trace (recorded in relation to virtual simulated time) could then be analyzed to understand various traits of individual schedulers (individual jobs' turnaround time, throughput, fairness, capacity guarantee, etc.). To support this, we would need a simulator which could accurately model the conditions of the actual system that would affect a scheduler's decisions. These include very large-scale clusters (thousands of nodes), the detailed characteristics of the workload thrown at the clusters, job or task failures, data locality, and cluster hardware (CPU, memory, disk I/O, network I/O, network topology).
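
      To make the idea concrete, here is a minimal sketch of a trace-driven, discrete-event simulation in virtual time. It is not the simulator attached to this issue: every name in it (Job, simulate, fifo, shortest_task_first) is hypothetical, the "cluster definition" is reduced to a single slot count, and task failures, data locality and network topology are ignored. It only illustrates how the same job trace can be replayed under two scheduling policies and compared on per-job turnaround time.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    """One entry of a (hypothetical) job trace."""
    job_id: str
    submit_time: int       # virtual seconds since the start of the trace
    task_durations: list   # one duration per task, in virtual seconds


def fifo(pending):
    """Policy: run a task of the earliest-submitted job first."""
    return min(range(len(pending)), key=lambda i: (pending[i][0].submit_time, i))


def shortest_task_first(pending):
    """Policy: run the shortest pending task first."""
    return min(range(len(pending)), key=lambda i: (pending[i][1], i))


def simulate(trace, num_slots, pick):
    """Replay `trace` on a virtual cluster with `num_slots` task slots.

    `pick` is the scheduling policy: given the list of pending
    (job, duration) pairs, it returns the index of the task to run next.
    Returns a dict mapping job_id to turnaround time in virtual seconds.
    """
    events = []  # min-heap of (virtual_time, sequence_number, kind, job)
    seq = 0      # unique tie-breaker so the heap never compares Job objects
    for job in trace:
        heapq.heappush(events, (job.submit_time, seq, "submit", job))
        seq += 1

    pending = []     # tasks waiting for a slot, as (job, duration) pairs
    free = num_slots
    remaining = {}   # job_id -> number of unfinished tasks
    finish = {}      # job_id -> virtual time at which the last task ended

    while events:
        # Virtual time jumps straight to the next event; no real waiting.
        now, _, kind, job = heapq.heappop(events)
        if kind == "submit":
            remaining[job.job_id] = len(job.task_durations)
            pending.extend((job, d) for d in job.task_durations)
            if not job.task_durations:
                finish[job.job_id] = now  # empty job completes immediately
        else:  # "task_done"
            free += 1
            remaining[job.job_id] -= 1
            if remaining[job.job_id] == 0:
                finish[job.job_id] = now

        # Ask the policy to fill every free slot.
        while free > 0 and pending:
            next_job, duration = pending.pop(pick(pending))
            free -= 1
            heapq.heappush(events, (now + duration, seq, "task_done", next_job))
            seq += 1

    return {job.job_id: finish[job.job_id] - job.submit_time for job in trace}


trace = [Job("a", 0, [10, 10]), Job("b", 1, [2])]
print(simulate(trace, 1, fifo))                 # {'a': 20, 'b': 21}
print(simulate(trace, 1, shortest_task_first))  # {'a': 22, 'b': 11}
```

      On this toy trace with a single slot, the shortest-task-first policy cuts job b's turnaround from 21 to 11 virtual seconds at the cost of 2 extra seconds for job a. That is the kind of trade-off the proposed simulator is meant to expose on realistic traces and cluster sizes.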

      1. mapreduce-728-20090918-6.patch
        844 kB
        Hong Tang
      2. mapreduce-728-20090918-5.patch
        844 kB
        Hong Tang
      3. mapreduce-728-20090918-3.patch
        844 kB
        Hong Tang
      4. mapreduce-728-20090918-2.patch
        842 kB
        Hong Tang
      5. mapreduce-728-20090918.patch
        842 kB
        Hong Tang
      6. mapreduce-728-20090917-4.patch
        842 kB
        Hong Tang
      7. mapreduce-728-20090917-3.patch
        840 kB
        Hong Tang
      8. mapreduce-728-20090917.patch
        157 kB
        Hong Tang
      9. 19-jobs.trace.json.gz
        594 kB
        Hong Tang
      10. 19-jobs.topology.json.gz
        5 kB
        Hong Tang
      11. mumak.png
        44 kB
        Arun C Murthy

        Issue Links

          Activity

          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.22.0 [ 12314184 ]
          Resolution Fixed [ 1 ]
          Chris Douglas made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.22.0 [ 12314184 ]
          Hong Tang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-6.patch [ 12420122 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-6.patch [ 12420121 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-6.patch [ 12420121 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-4.patch [ 12420091 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-5.patch [ 12420093 ]
          Hong Tang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-4.patch [ 12420089 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-4.patch [ 12420091 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-4.patch [ 12420089 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-3.patch [ 12420088 ]
          Hong Tang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918-2.patch [ 12420084 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090918.patch [ 12420077 ]
          Hong Tang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Link This issue relates to MAPREDUCE-729 [ MAPREDUCE-729 ]
          Hong Tang made changes -
          Link This issue relates to MAPREDUCE-1006 [ MAPREDUCE-1006 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090917-4.patch [ 12419962 ]
          Hong Tang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Link This issue relates to MAPREDUCE-1001 [ MAPREDUCE-1001 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090917-3.patch [ 12419958 ]
          Hong Tang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hong Tang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Affects Version/s 0.21.0 [ 12314045 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090917.patch [ 12419894 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090917.patch [ 12419924 ]
          Hong Tang made changes -
          Link This issue is blocked by MAPREDUCE-995 [ MAPREDUCE-995 ]
          Hong Tang made changes -
          Link This issue is blocked by MAPREDUCE-729 [ MAPREDUCE-729 ]
          Hong Tang made changes -
          Assignee Arun C Murthy [ acmurthy ] Hong Tang [ hong.tang ]
          Hong Tang made changes -
          Attachment 19-jobs.trace.json.gz [ 12419896 ]
          Attachment 19-jobs.topology.json.gz [ 12419895 ]
          Hong Tang made changes -
          Attachment mapreduce-728-20090917.patch [ 12419894 ]
          Dick King made changes -
          Link This issue is blocked by MAPREDUCE-751 [ MAPREDUCE-751 ]
          Arun C Murthy made changes -
          Attachment mumak.png [ 12412795 ]
          Arun C Murthy made changes -
          Field Original Value New Value
          Link This issue is blocked by MAPREDUCE-729 [ MAPREDUCE-729 ]
          Arun C Murthy created issue -

            People

            • Assignee:
              Hong Tang
              Reporter:
              Arun C Murthy
            • Votes:
              0
              Watchers:
              32

              Dates

              • Created:
                Updated:
                Resolved:
