Hadoop Map/Reduce / MAPREDUCE-2256

FairScheduler fairshare preemption from multiple pools may preempt all tasks from one pool causing that pool to go below fairshare.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.1, 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: contrib/fair-share
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Scenario:
      A cluster has 600 map slots and 3 pools, so the fair share for each pool starts at 200 slots. The fair share preemption timeout is 5 minutes.
      1) Pool1 schedules 300 map tasks first.
      2) Pool2 then schedules another 300 map tasks.
      3) Pool3 demands 300 map tasks but gets no slots because all 600 are taken.
      4) After 5 minutes, pool3 should preempt 200 map slots. Instead of preempting 100 slots each from pool1 and pool2, the bug causes it to preempt all 200 slots from pool2 (the last started), which drops pool2 to 100 slots, below its fair share. This happens because the preemptTask method does not reduce the count of tasks left in a pool as it preempts tasks from that pool.

      The above scenario may be an extreme case, but some amount of excess preemption would happen in other cases as well because of this bug.
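
      The accounting problem in the scenario can be sketched in a few lines of Java. This is a simplified model and not the FairScheduler source: the class PreemptionSketch, its methods, and the decrementOnPreempt flag are hypothetical names made up for illustration. It assumes tasks are scanned from most to least recently started, and that a task is eligible for preemption only while its pool's remaining task count is above the pool's fair share.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the preemption loop; not the actual scheduler code.
class PreemptionSketch {

    /**
     * Scans tasks (identified only by their owning pool) in the given order and
     * preempts until toPreempt tasks have been taken. A task is eligible only
     * while its pool's remaining count is above fairShare.
     *
     * @param decrementOnPreempt false models the bug (remaining count never updated),
     *                           true models the corrected bookkeeping
     * @return number of tasks preempted per pool
     */
    static Map<String, Integer> preempt(List<String> taskOwners,
                                        Map<String, Integer> running,
                                        int fairShare,
                                        int toPreempt,
                                        boolean decrementOnPreempt) {
        Map<String, Integer> tasksLeft = new HashMap<>(running);
        Map<String, Integer> preempted = new HashMap<>();
        for (String pool : taskOwners) {
            if (toPreempt == 0) {
                break;
            }
            if (tasksLeft.get(pool) > fairShare) {
                preempted.merge(pool, 1, Integer::sum);
                toPreempt--;
                if (decrementOnPreempt) {
                    // The missing step: record that this pool now has one task fewer.
                    tasksLeft.merge(pool, -1, Integer::sum);
                }
            }
        }
        return preempted;
    }

    /** The scenario from the description: 600 slots, pool3 needs 200 of them. */
    static Map<String, Integer> scenario(boolean decrementOnPreempt) {
        Map<String, Integer> running = new HashMap<>();
        running.put("pool1", 300);
        running.put("pool2", 300);
        // pool2 started last, so its tasks come first in the scan order.
        List<String> taskOwners = new ArrayList<>();
        for (int i = 0; i < 300; i++) taskOwners.add("pool2");
        for (int i = 0; i < 300; i++) taskOwners.add("pool1");
        return preempt(taskOwners, running, 200, 200, decrementOnPreempt);
    }

    public static void main(String[] args) {
        System.out.println("without decrement: " + scenario(false));
        System.out.println("with decrement:    " + scenario(true));
    }
}
```

      Without the decrement, pool2's remaining count stays at 300 throughout the scan, so all 200 preemptions land on pool2. With the decrement, pool2 stops being eligible once it reaches 200, and the remaining 100 preemptions come from pool1.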

      The patch I created is for 0.22.0, but the code fix should also work on 0.21, which appears to have the same bug.

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-22-branch #33 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/33/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #595 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/595/)

        Konstantin Shvachko added a comment -

        I just committed this. Thank you Priyo.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12468031/mapreduce-2256_0_22.txt
        against trunk revision 1064287.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/19//testReport/
        Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/19//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/19//console

        This message is automatically generated.

        Matei Zaharia added a comment -

        Agreed on putting it in 0.22.

        Priyo Mustafi added a comment -

        This is an important bug in the FairScheduler. I think it needs to be fixed in 0.22.

        Matei Zaharia added a comment -

        +1 looks good to me. Let's wait for Hudson to run its tests, and I'll commit it to trunk, 0.22 and 0.21 unless anyone brings up concerns.

        Todd Lipcon added a comment -

        Looks good to me, good catch, Priyo. +1 pending test results (and should probably give Matei a day or two to take a look as well)

        Priyo Mustafi added a comment -

        Patch


          People

          • Assignee:
            Priyo Mustafi
            Reporter:
            Priyo Mustafi
          • Votes:
            0
            Watchers:
            4
