Hadoop Map/Reduce / MAPREDUCE-2261

Fair Multiple Task Assignment Scheduler (Assigning multiple tasks per heart beat)

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Functionally, the Fair Multiple Task Assignment Scheduler behaves the same way as the existing fair scheduler except in how it assigns tasks. Instead of assigning a single task per heartbeat, it checks all jobs for any local or non-local task that can be launched.

      The Fair Multiple Task Assignment Scheduler can assign multiple jobs per heartbeat interval, depending on the slots available on the Task Tracker; the number of tasks that run in parallel on a Task Tracker at any point in time is configurable. The advantages are as follows:

      a) Parallel execution allows tasks to be submitted and processed in parallel, independent of the status of other tasks.
      b) More tasks are assigned in a heartbeat interval, and consequently multitasking capability increases.
      c) With multi task assignment, Task Tracker efficiency is increased.

        Activity

        Devaraj K added a comment -

        I have validated all features against trunk and all are available. Thanks Todd and Lianhui for giving me the required information.

        Lianhui Wang added a comment -

        Todd, thank you. I see the code now.
        Outside of the for {} loop there is a while {} loop that keeps assigning tasks up to the cap.

        Todd Lipcon added a comment -

        MAPREDUCE-706 was a poorly named JIRA. It actually rewrote much of the fair scheduler, including adding this feature. In trunk, look at the usage of the mapAssignCap and reduceAssignCap variables. You'll see that assignTasks loops until the specified number of tasks have been assigned or the load manager indicates that no more can be assigned.
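
        The loop described in this comment can be sketched roughly as follows. This is a simplified illustration, not the FairScheduler source: the class name, the method signature, and the use of a free-slot counter as a stand-in for the load manager check are all assumptions. The assignCap parameter plays the role that mapAssignCap and reduceAssignCap play in trunk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified model of a capped per-heartbeat assignment loop.
class AssignCapSketch {

    /**
     * Hands out pending tasks until the cap is reached, the tracker has no
     * free slots, or nothing is left to assign.
     */
    static List<String> assignTasks(Deque<String> pending, int freeSlots, int assignCap) {
        List<String> assigned = new ArrayList<>();
        while (assigned.size() < assignCap   // per-heartbeat cap
                && freeSlots > 0             // assumed stand-in for the load manager check
                && !pending.isEmpty()) {
            assigned.add(pending.poll());
            freeSlots--;
        }
        return assigned;
    }

    public static void main(String[] args) {
        Deque<String> pending = new ArrayDeque<>(Arrays.asList("m1", "m2", "m3", "m4"));
        // A cap of 1 reproduces the one-task-per-heartbeat behaviour.
        System.out.println(assignTasks(pending, 3, 1));
        // A larger cap lets several tasks go out in the same heartbeat.
        System.out.println(assignTasks(pending, 3, 2));
    }
}
```

        With a cap of 1 the sketch behaves like the single-assignment loop quoted elsewhere in this thread; raising the cap is what allows several tasks per heartbeat.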

        Lianhui Wang added a comment -

        MAPREDUCE-706 adds support for FIFO pools as an alternative to fair sharing. It only adds FIFO scheduling to the pools;
        it does not assign multiple tasks per heartbeat.

        Lianhui Wang added a comment -

        Hi Todd,
        I am looking at the code in trunk.
        Does trunk not include the MAPREDUCE-706 patch?

        Todd Lipcon added a comment -

        I believe you're both looking at the fair scheduler code from 0.20. MAPREDUCE-706 rewrote a lot of the fair scheduler and includes this feature.

        Lianhui Wang added a comment -

        I agree with Devaraj K.
        In the current version, the fair scheduler assigns only one task per heartbeat.
        In the class FairScheduler we can see the following:

        assignTasks(TaskTracker tracker) {
          for (Schedulable sched : scheds) {
            Task task = taskType == TaskType.MAP ?
                sched.assignTask(tts, currentTime, visitedForMap) :
                sched.assignTask(tts, currentTime, visitedForReduce);
            if (task != null) {
              tasks.add(task);
              break; // This break makes this loop assign only one task
            }
          }
        }

        If it assigns one task, the function returns.

        Devaraj K added a comment -

        Hi Todd,

        The existing fair scheduler assigns a single job per heartbeat even if more slots are available in the task tracker. The next job is assigned to the same task tracker only on the next heartbeat. This way we can assign multiple tasks, but not within the same heartbeat.

        The Fair Multi Task Assignment Scheduler assigns multiple jobs to the task tracker as follows.

        If the number of jobs in the queue is less than or equal to the cluster capacity, it calculates the number of jobs that can be assigned to each task tracker (shared equally) and assigns multiple jobs to each task tracker in a heartbeat.

        If the number of jobs in the queue is more than the cluster capacity, it assigns multiple jobs to the task tracker based on that task tracker's capacity in a heartbeat. This repeats for all the task trackers in the cluster.
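
        One possible reading of the two cases above, as a minimal sketch. No patch is attached to this issue, so everything here is an assumption: the class and parameter names are hypothetical, every task tracker is assumed to have the same number of slots, and "shared equally" is interpreted as an even split rounded up.

```java
// Hypothetical illustration of the proposed per-heartbeat assignment count.
class PerHeartbeatShareSketch {

    /**
     * Returns how many jobs one task tracker would receive in a single
     * heartbeat. Assumes numTrackers > 0 and uniform slotsPerTracker.
     */
    static int jobsPerHeartbeat(int queuedJobs, int numTrackers, int slotsPerTracker) {
        int clusterCapacity = numTrackers * slotsPerTracker;
        if (queuedJobs <= clusterCapacity) {
            // Queue fits in the cluster: split it evenly, rounding up.
            return (queuedJobs + numTrackers - 1) / numTrackers;
        }
        // Queue exceeds the cluster: fill this tracker to its own capacity.
        return slotsPerTracker;
    }

    public static void main(String[] args) {
        // 4 trackers with 2 slots each give a cluster capacity of 8.
        System.out.println(jobsPerHeartbeat(3, 4, 2));  // queue fits: even share
        System.out.println(jobsPerHeartbeat(20, 4, 2)); // queue exceeds capacity
    }
}
```

        Because the even share is rounded up, the value is an upper bound per tracker; the last trackers to heartbeat may receive fewer jobs once the queue is drained.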

        Todd Lipcon added a comment -

        How does this differ from the assignmultiple feature present in the fairscheduler in trunk?


          People

          • Assignee:
            Unassigned
            Reporter:
            Devaraj K
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved:
