Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-1198

Alternatively schedule different types of tasks in fair share scheduler

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: contrib/fair-share
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Matei has mentioned in MAPREDUCE-961 that the current scheduler will first try to launch map tasks until canLaunthTask() returns false then look for reduce tasks. This might starve reduce task. He also mention that alternatively schedule different types of tasks can solve this problem.

      1. MAPREDUCE-1198-v5.patch
        29 kB
        Scott Chen
      2. MAPREDUCE-1198-v4.patch
        29 kB
        Scott Chen
      3. MAPREDUCE-1198-v3.patch
        29 kB
        Scott Chen
      4. MAPREDUCE-1198-v2.patch
        29 kB
        Scott Chen
      5. MAPREDUCE-1198-v1.patch
        20 kB
        Scott Chen

        Activity

        Hide
        Scott Chen added a comment -

        In FairScheduler.assignTasks(), before scanning the jobs for a map and a reduce, we first sort the task type order according to the number of running maps and reduces on the TT. The original way will always first look at a map and then a reduce which may starve reduce tasks.

        This is basically an implementation of Matei's pseudo code mentioned in MAPREDUCE-961.

        Show
        Scott Chen added a comment - In FairScheduler.assignTasks(), before scanning the jobs for a map and a reduce, we first sort the task type order according to the number of running maps and reduces on the TT. The original way will always first look at a map and then a reduce which may starve reduce tasks. This is basically an implementation of Matei's pseudo code mentioned in MAPREDUCE-961 .
        Hide
        Matei Zaharia added a comment -

        Hi Scott,

        I don't think what you have is quite the right fix, because when a node has space to launch more than one task, the scheduler will still assign multiple tasks of the same type rather than alternating between them. In the pre-0.21 fair scheduler, where only one task of each type was assigned per heartbeat, this solution would've worked okay. However, in the current trunk, the scheduler can launch multiple tasks per heartbeat, and with your code, it will launch many maps or many reduces (depending on which one it had fewer of), but not both.

        The right thing would be to add a while loop outside the current for loop. On each iteration of the while loop, you'd order the task types (as

        {map, reduce}

        or

        {reduce, map}

        ) and then look for a single task to launch. You then stop the while loop when neither maps nor reduces have been launched in a particular iteration. That's what I tried to describe in my post on MAPREDUCE-961, sorry if it wasn't clear.

        Show
        Matei Zaharia added a comment - Hi Scott, I don't think what you have is quite the right fix, because when a node has space to launch more than one task, the scheduler will still assign multiple tasks of the same type rather than alternating between them. In the pre-0.21 fair scheduler, where only one task of each type was assigned per heartbeat, this solution would've worked okay. However, in the current trunk, the scheduler can launch multiple tasks per heartbeat, and with your code, it will launch many maps or many reduces (depending on which one it had fewer of), but not both. The right thing would be to add a while loop outside the current for loop. On each iteration of the while loop, you'd order the task types (as {map, reduce} or {reduce, map} ) and then look for a single task to launch. You then stop the while loop when neither maps nor reduces have been launched in a particular iteration. That's what I tried to describe in my post on MAPREDUCE-961 , sorry if it wasn't clear.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12425128/MAPREDUCE-1198-v1.patch
        against trunk revision 836063.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425128/MAPREDUCE-1198-v1.patch against trunk revision 836063. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/246/console This message is automatically generated.
        Hide
        Scott Chen added a comment -

        Thanks for the comments, Matei.
        I see. So I should consider one task at a time and alternate between task types. I will work on the new patch and post it soon.

        Show
        Scott Chen added a comment - Thanks for the comments, Matei. I see. So I should consider one task at a time and alternate between task types. I will work on the new patch and post it soon.
        Hide
        Scott Chen added a comment -

        I rewrite the loop such that it can assign only one task at a time. The loop will choose the task type to be assigned every time.

        Show
        Scott Chen added a comment - I rewrite the loop such that it can assign only one task at a time. The loop will choose the task type to be assigned every time.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12425272/MAPREDUCE-1198-v2.patch
        against trunk revision 881536.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425272/MAPREDUCE-1198-v2.patch against trunk revision 881536. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/249/console This message is automatically generated.
        Hide
        Scott Chen added a comment -

        The test fail is related to MAPREDUCE-915. It has been fixed in MAPREDUCE-687.

        @Matei: Can you help me review the new patch again? Thank you.

        Show
        Scott Chen added a comment - The test fail is related to MAPREDUCE-915 . It has been fixed in MAPREDUCE-687 . @Matei: Can you help me review the new patch again? Thank you.
        Hide
        Matei Zaharia added a comment -

        I'm taking a look at it. It looks okay so far, except I don't understand why you have if (mapRejected == true) as part of the condition to set mapRejected = true (and similar for reduceRejected). This looks like it won't change anything, but it will just write extra statements to the log.

        Show
        Matei Zaharia added a comment - I'm taking a look at it. It looks okay so far, except I don't understand why you have if (mapRejected == true) as part of the condition to set mapRejected = true (and similar for reduceRejected). This looks like it won't change anything, but it will just write extra statements to the log.
        Hide
        Scott Chen added a comment -

        My intention is that when a task type is rejected once, it should always be rejected. Since it is the first condition in || (or), if it is true the following condition checks will be skipped.

        But you are right, this will not change the logic if we remove it. If it is confusing, I think it is OK to remove it.

        Show
        Scott Chen added a comment - My intention is that when a task type is rejected once, it should always be rejected. Since it is the first condition in || (or), if it is true the following condition checks will be skipped. But you are right, this will not change the logic if we remove it. If it is confusing, I think it is OK to remove it.
        Hide
        Scott Chen added a comment -

        You are right about the LOG. I didn't aware that the LOG should write only on the first time. I will fix this and post it right away.

        Show
        Scott Chen added a comment - You are right about the LOG. I didn't aware that the LOG should write only on the first time. I will fix this and post it right away.
        Hide
        Scott Chen added a comment -

        I did a minor change on the reject condition check. Now it will not print the log repeatedly. Thanks for pointing this out, Matei.

        Show
        Scott Chen added a comment - I did a minor change on the reject condition check. Now it will not print the log repeatedly. Thanks for pointing this out, Matei.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12425418/MAPREDUCE-1198-v3.patch
        against trunk revision 881673.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425418/MAPREDUCE-1198-v3.patch against trunk revision 881673. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/143/console This message is automatically generated.
        Hide
        Matei Zaharia added a comment -

        I've looked at the patch in more detail. It mostly looks good, but I have a question: When no task is found in the loop over Schedulables (for (Schedulable sched: scheds)), why do you still increase the mapAssigningCount or reduceAssigningCount after the loop? My guess is that you're increasing it so that it reaches its cap and we stop trying to assign that type of task. However, it would be clearer if you just set mapRejected or reduceRejected to true when you don't find a task of that type. You could add a boolean variable outside the loop called foundTask, initialize it to false, and set it to true when you do find a task (before the break). Then, after the loop, if findTask == false, set mapRejected or reduceRejected to true.

        Also, a couple of smaller comments:

        • Instead of the names mapAssigningCount and reduceAssigningCount, you can use mapsAssigned and reducesAssigned.
        • In the inner loop, the launchedMap.add(job); statement that is inside an if (taskType == TaskType.MAP) could be moved below to the MAP case of the "update running task counts" statement.
        • "to prevent the type from starving" -> "to prevent that task type from starving"
        • Capitalize the this in "// this is the only exit of the while (true) loop"
        Show
        Matei Zaharia added a comment - I've looked at the patch in more detail. It mostly looks good, but I have a question: When no task is found in the loop over Schedulables (for (Schedulable sched: scheds)), why do you still increase the mapAssigningCount or reduceAssigningCount after the loop? My guess is that you're increasing it so that it reaches its cap and we stop trying to assign that type of task. However, it would be clearer if you just set mapRejected or reduceRejected to true when you don't find a task of that type. You could add a boolean variable outside the loop called foundTask, initialize it to false, and set it to true when you do find a task (before the break). Then, after the loop, if findTask == false, set mapRejected or reduceRejected to true. Also, a couple of smaller comments: Instead of the names mapAssigningCount and reduceAssigningCount, you can use mapsAssigned and reducesAssigned. In the inner loop, the launchedMap.add(job); statement that is inside an if (taskType == TaskType.MAP) could be moved below to the MAP case of the "update running task counts" statement. "to prevent the type from starving" -> "to prevent that task type from starving" Capitalize the this in "// this is the only exit of the while (true) loop"
        Hide
        Scott Chen added a comment -

        I agree with all these points. They are all good suggestions. Thank you for the careful review. I will start work on them.

        Show
        Scott Chen added a comment - I agree with all these points. They are all good suggestions. Thank you for the careful review. I will start work on them.
        Hide
        Scott Chen added a comment -

        I have revised the code according to Matei's suggestions.
        1. Initialize the rejection flags to true. So if the task is not found, it will be mark as rejected
        2. Change the counter to mapAssigned and reduceAssigned and move them in the proper places
        3. Move the initialization of the HashSet visited outside the loop to avoid repeated initialization
        4. Changes of the comments according to the suggestions.

        Show
        Scott Chen added a comment - I have revised the code according to Matei's suggestions. 1. Initialize the rejection flags to true. So if the task is not found, it will be mark as rejected 2. Change the counter to mapAssigned and reduceAssigned and move them in the proper places 3. Move the initialization of the HashSet visited outside the loop to avoid repeated initialization 4. Changes of the comments according to the suggestions.
        Hide
        Matei Zaharia added a comment -

        A couple of comments:

        • I think initializing the HashSet outside the while loop and clearing it on every for iteration is unnecessary optimization. I'd suggest initializing it in the while loop instead (i.e. have it be a variable right above the for (Schedulable sched: scheds) loop). Also, if you do this, you don't need to clear the set in each iteration of the for loop, and you can do the visitedForMap.addAll() after the for loop. It's okay if a job is added to the HashSet twice - after all, it is a Set. The only thing we want to avoid is having the same set used for maps and reduces.
        • It would be better to call the mapAssigned and reduceAssigned variables mapsAssigned and reducesAssigned instead, to make it clear that they're counts.
        • In my opinion setting mapRejected / reduceRejected = true and then unsetting it is more confusing than having an extra foundTask flag that is set to true when we find a task. I'd prefer the latter, because I want the code to be as easy as possible for a newcomer to understand. However, I don't have strong feelings about this.
        Show
        Matei Zaharia added a comment - A couple of comments: I think initializing the HashSet outside the while loop and clearing it on every for iteration is unnecessary optimization. I'd suggest initializing it in the while loop instead (i.e. have it be a variable right above the for (Schedulable sched: scheds) loop). Also, if you do this, you don't need to clear the set in each iteration of the for loop, and you can do the visitedForMap.addAll() after the for loop. It's okay if a job is added to the HashSet twice - after all, it is a Set. The only thing we want to avoid is having the same set used for maps and reduces. It would be better to call the mapAssigned and reduceAssigned variables mapsAssigned and reducesAssigned instead, to make it clear that they're counts. In my opinion setting mapRejected / reduceRejected = true and then unsetting it is more confusing than having an extra foundTask flag that is set to true when we find a task. I'd prefer the latter, because I want the code to be as easy as possible for a newcomer to understand. However, I don't have strong feelings about this.
        Hide
        Scott Chen added a comment -

        1. I have changed mapAssigned and reduceAssigned variables to mapsAssigned and reducesAssigned.
        2. I agree that using the foundTask flag is a better idea. It does make the code clear and the function is the same. So I also made the change.
        3. I still feel bad about creating a new hashmap for every task assigned. So in this patch there are two hashmap: one for map and the other dummy one for reduce. I don't know if you like this solution. If you think this is not good, we can change it back to your proposal.

        Show
        Scott Chen added a comment - 1. I have changed mapAssigned and reduceAssigned variables to mapsAssigned and reducesAssigned. 2. I agree that using the foundTask flag is a better idea. It does make the code clear and the function is the same. So I also made the change. 3. I still feel bad about creating a new hashmap for every task assigned. So in this patch there are two hashmap: one for map and the other dummy one for reduce. I don't know if you like this solution. If you think this is not good, we can change it back to your proposal.
        Hide
        Matei Zaharia added a comment -

        This looks good, the two hashsets are a good idea. I'll take another quick look at it tomorrow and commit it.

        Show
        Matei Zaharia added a comment - This looks good, the two hashsets are a good idea. I'll take another quick look at it tomorrow and commit it.
        Hide
        Scott Chen added a comment -

        Thanks a lot for all the help, Matei

        Show
        Scott Chen added a comment - Thanks a lot for all the help, Matei
        Hide
        Matei Zaharia added a comment -

        +1, looks good. I've committed the patch. Thanks Scott!

        Show
        Matei Zaharia added a comment - +1, looks good. I've committed the patch. Thanks Scott!
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #125 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/125/)
        . Alternatively schedule different types of tasks in fair share
        scheduler. Contributed by Scott Chen.

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #125 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/125/ ) . Alternatively schedule different types of tasks in fair share scheduler. Contributed by Scott Chen.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #162 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/162/)

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #162 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/162/ )

          People

          • Assignee:
            Scott Chen
            Reporter:
            Scott Chen
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development