Hadoop Map/Reduce
MAPREDUCE-2905

CapBasedLoadManager incorrectly allows assignment when assignMultiple is true (was: assignmultiple per job)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 1.1.0
    • Component/s: contrib/fair-share
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We encountered a situation where, in the same cluster, large jobs benefit from mapred.fairscheduler.assignmultiple but small jobs with small numbers of mappers do not: the mappers all clump together and fully occupy just a few nodes, which causes those nodes to saturate and become a bottleneck. The desired behavior is to spread the job across more nodes so that a relatively small job doesn't saturate any node in the cluster.

      Testing has shown that setting mapred.fairscheduler.assignmultiple to false gives the desired behavior for small jobs, but is unnecessary for large jobs. However, since this is a cluster-wide setting, we can't tune it properly for both.

      It'd be nice if jobs could set a parameter similar to mapred.fairscheduler.assignmultiple at submission time to better control the task distribution of a particular job.
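      For reference, the switch being tuned is the existing cluster-wide entry in mapred-site.xml (sketched below); the per-job variant requested in this ticket does not exist:

        <property>
          <name>mapred.fairscheduler.assignmultiple</name>
          <value>false</value>
        </property>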

      Attachments

      1. ASF.LICENSE.NOT.GRANTED--screenshot-1.jpg
        58 kB
        Jeff Bean
      2. MR-2905.10-13-2011
        12 kB
        Jeff Bean
      3. MR-2905.patch
        4 kB
        Jeff Bean
      4. MR-2905.patch.2
        5 kB
        Jeff Bean
      5. mr-2905.txt
        11 kB
        Todd Lipcon
      6. mr-2905.txt
        10 kB
        Todd Lipcon

        Activity

        Jeff Bean added a comment -

        Example:

        5 node cluster, 5 slots per node.

        Set assign multiple to true.

        Submit a sleep job with 7 mappers.

        All 7 mappers are assigned to nodes A and B.

        Set assign multiple to false. Bounce job tracker.

        Submit a sleep job with 7 mappers.

        7 mappers now spread evenly across cluster.

        I should be able to submit 2 sleep jobs, maybe job 1 has 7 mappers and job 2 has 700 mappers. I want to set assignmultiple on job 2 so that every node saturates quickly, while job 1 has assignmultiple set to false, which means it gets one task assigned at a time and therefore spreads out more evenly.
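        (For scale: with 5 nodes x 5 slots = 25 slots and 7 runnable maps, a per-node cap proportional to cluster load would be ceil(7 * 5 / 25) = 2, so an even spread puts at most 2 mappers on any node, while the clumped assignment packs up to 5 onto one; the scheduler's exact cap formula may differ.)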

        Todd Lipcon added a comment -

        There's code in the fair scheduler which tries to avoid doing this, but I think something about it is currently broken, since I've heard many people report the same behavior. Look at CapBasedLoadManager.java.

        Either way, I don't think doing this per-job is the right solution - better to avoid exposing the config, and just Do The Right Thing.

        Jeff Bean added a comment -

        CapBasedLoadManager looks OK, but the endless loop in the fair scheduler that evaluates the output of CapBasedLoadManager, in the assignTasks method, looks suspect to me.

        Jeff Bean added a comment -

        I was mistaken. CapBasedLoadManager isn't working.

        The problem is this:

        public boolean canAssignMap(TaskTrackerStatus tracker,
            int totalRunnableMaps, int totalMapSlots) {
          return tracker.countMapTasks() <
              getCap(totalRunnableMaps, tracker.getMaxMapSlots(), totalMapSlots);
        }

        The calls to tracker.countMapTasks() and tracker.countReduceTasks() always return 0 when called from the loop inside assignTasks. Even though tasks get assigned by the scheduler, countMapTasks() and countReduceTasks() don't change what they return.

        We need to either fix countMapTasks() or find another way to pass in information about the tasks the scheduler has just assigned.
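        For context, here is a simplified sketch of the cap that the check compares against (illustrative only; the actual getCap math in the Hadoop source may differ). The comparison itself is sound; the bug is that the left-hand side never grows within the loop.

          // Illustrative only: scale the tracker's slots by overall cluster
          // load, so a lightly loaded cluster yields a small per-node cap.
          int getCap(int totalRunnable, int trackerSlots, int totalSlots) {
            double load = Math.min(1.0, (double) totalRunnable / totalSlots);
            return (int) Math.ceil(trackerSlots * load);
          }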

        Jeff Bean added a comment -

        Please review and validate the approach.

        The problem is that when assignMultiple is turned on, canAssignMap and canAssignReduce get called in a loop, which causes the task tracker status to go out of date as new tasks are marked for assignment: tasks get added to the list, but don't actually get assigned until assignTasks exits, so the status never reflects them.

        The patch modifies CapBasedLoadManager to track the number of times canAssignMap and canAssignReduce return "true" for the same task tracker. It assumes that each time one of them returns true, there's potentially another running task to consider when deciding whether we can assign a new task to this tracker.
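        A minimal sketch of that bookkeeping (hypothetical field and variable names, not the literal patch):

          import java.util.HashMap;
          import java.util.Map;

          // Hypothetical sketch: count approvals granted per tracker during
          // the current assignTasks() pass and add them to the stale count
          // reported by the tracker status.
          private Map<String, Integer> approvedMaps = new HashMap<String, Integer>();

          public boolean canAssignMap(TaskTrackerStatus tracker,
              int totalRunnableMaps, int totalMapSlots) {
            Integer n = approvedMaps.get(tracker.getTrackerName());
            int inFlight = (n == null) ? 0 : n;
            boolean ok = tracker.countMapTasks() + inFlight <
                getCap(totalRunnableMaps, tracker.getMaxMapSlots(), totalMapSlots);
            if (ok) {
              approvedMaps.put(tracker.getTrackerName(), inFlight + 1);
            }
            return ok;
          }

        (The counters would also need resetting once fresh tracker statuses arrive; that part is omitted here.)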

        Jeff Bean added a comment -

        Verified that this fix works. Can someone please look at it?

        Harsh J added a comment -

        Jeff,

        I'll leave the final review to people better suited to reviewing FairScheduler patches, but I'm going to post some notes on getting this patch to an acceptable state.

        A few nits:

        • The patch mixes spaces and tabs. Follow the coding guidelines and use only spaces: 2 spaces per indent instead of the hard tab characters that are present right now.
        • If you'd like to get this included upstream, you'll have to re-upload the patch with the license grant to the ASF. This is doable when you attach a file (look for an option at the bottom – perhaps you missed it).

        If possible, can we somehow have a test for this? Just asking.

        Jeff Bean added a comment -

        The test is a sleep job with slightly more mappers than slots on a node. Without the patch it clumps on one node. With the patch it spreads evenly. I also have a unit test that I used to discover the bug. It's a little messy. Let me know if you need it.

        Todd Lipcon added a comment -

        We do need a JUnit test. Manual tests are generally insufficient since there's little guarantee they won't regress.

        Jeff Bean added a comment -

        I don't understand why the existing JUnit test for CapBasedLoadManager isn't sufficient to show that the patch doesn't regress. It was sufficient before, so if the patched LoadManager still passes the test on the build, shouldn't that be sufficient now?

        The JUnit test actually doesn't properly catch this bug. When assignMultiple is on, canAssignMap is called in a loop from the fair scheduler with the same TaskTrackerStatus object, but the JUnit test reinstantiates the status object on each call to canAssignMap.

        I can patch the existing JUnit test to reproduce the issue, which would make the existing CapBasedLoadManager fail. Is that sufficient?
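        Roughly what that stronger test would look like (a sketch with hypothetical helpers, inside a JUnit TestCase; getTracker() stands in for however the test builds a status with a couple of map slots):

          TaskTrackerStatus tts = getTracker("tt1");   // reused across calls,
                                                       // the way assignTasks does
          CapBasedLoadManager loadMgr = new CapBasedLoadManager();
          int granted = 0;
          for (int i = 0; i < 10; i++) {
            if (loadMgr.canAssignMap(tts, 10, 25)) {
              granted++;   // nothing refreshes tts between iterations
            }
          }
          // The unpatched manager approves all 10; a correct one must stop
          // at the tracker's cap.
          assertTrue("tracker exceeded its cap", granted < 10);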

        Jeff Bean added a comment -

        Checked "grant for inclusion".
        Fixed the tab vs. space issue per Harsh.

        Todd Lipcon added a comment -

        That's exactly the point. If the existing test didn't fail, even though there was a bug, then the existing test wasn't good enough. Hence you need to add a new test or improve the existing one in such a way that the test fails without this bug fix and passes with it.

        Jeff Bean added a comment -

        Ok. That's different from testing for regressions, but maybe I can do both. I have the code; it's just not in JUnit form. Stay tuned!

        Todd Lipcon added a comment -

        The point isn't to test that this code didn't introduce a regression - the point is to add a regression test for this bug so that future code doesn't regress this fix.

        Jeff Bean added a comment -

        Unit test included. The unit test found a typo, which was fixed ("assign map assign reduce, whatever").

        Jeff Bean added a comment -

        The unit test failure exposes the issue: when assignmultiple is true, a load manager might be asked about 3 maps in a loop, and it allows all of them.

        Todd Lipcon added a comment -

        The patch seems to reformat a bunch of stuff to 4-space indentation instead of 2-space, making it tough to review. Since I've already made you do several iterations, let me take care of the next one for you; I'll upload an updated patch soon.

        Jeff Bean added a comment -

        Oh, I'm happy to do it. Is two-space indentation an Apache or Hadoop standard?

        Todd Lipcon added a comment -

        How does this patch look, Jeff? I took a slightly different approach and simply passed the number of already-assigned tasks in as part of the API. It breaks the LoadManager API, but in a very simple way, and it's only very rarely extended (and by expert users).

        This made the changes in CapBasedLoadManager much simpler.
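        The shape of the change, roughly (the extra parameter's name is assumed here for illustration):

          // The scheduler passes in how many tasks it has already handed out
          // during this heartbeat, so the load manager needs no state of its own.
          public boolean canAssignMap(TaskTrackerStatus tracker,
              int totalRunnableMaps, int totalMapSlots, int alreadyAssigned) {
            return tracker.countMapTasks() + alreadyAssigned <
                getCap(totalRunnableMaps, tracker.getMaxMapSlots(), totalMapSlots);
          }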

        If you're able to perform the manual sleep-job test to verify this, that would be awesome.

        Jeff Bean added a comment -

        Hi Todd, I considered your patch but thought it was illegal to break an API.

        Anyway, I tested your patch and it's adequate. On a 5-node cluster with 7 slots per node, I ran the test:

        hadoop jar /usr/lib/hadoop/hadoop-examples.jar sleep -m 12 -mt 300000

        Your patch divvies up the tasks 3, 3, 3, 3, 0.

        My patch divvies up the tasks 2, 2, 2, 2, 2.

        Mine's a little better, but I'm not complaining: without either patch, the tasks are distributed as follows:

        7, 5, 0, 0, 0.
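        (For reference, the 3, 3, 3, 3, 0 split places all 12 maps and is consistent with a load-proportional per-node cap of ceil(12 * 7 / 35) = 3, assuming a cap of the kind sketched earlier.)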

        Matei Zaharia added a comment -

        Sorry for taking a bit of time to get to this, but I agree with Todd that the approach where we change the LoadManager API a little is better. LoadManager was defined as an extension point but it's a pretty advanced feature that I don't think anyone except Facebook has looked at changing. The benefits in understandability from Todd's approach (from not having to maintain state in the LoadManager object) outweigh the cost.

        Eli Collins added a comment -

        Hi Matei,
        Are you +1 on Todd's patch or just the approach, i.e., does the patch look good to go?

        Matei Zaharia added a comment -

        Yup, the patch itself also looks good.

        Todd Lipcon added a comment -

        TestFairScheduler is failing with this patch - so we need to either fix up the test case or figure out what bug the patch might have introduced.

        Jeff Bean added a comment -

        More info about the failure?

        Jeff Bean added a comment -

        TestFairScheduler is buggy. In one instance it actually looks for the bug condition and passes it, by asserting that tt1 fills to capacity on small jobs that shouldn't clump together:

        checkAssignment("tt1", "attempt_test_0001_m_000000_0 on tt1",
                        "attempt_test_0001_r_000000_0 on tt1",
                        "attempt_test_0002_m_000000_0 on tt1",
                        "attempt_test_0002_r_000000_0 on tt1");
        checkAssignment("tt2", "attempt_test_0001_m_000001_0 on tt2",
                        "attempt_test_0002_r_000001_0 on tt2");

        If tt1 has a capacity of 2 mappers and 2 reducers, I expect only 1 mapper and 1 reducer to be assigned, since filling all 4 slots is exactly the clumping condition this patch is trying to address.

        On every instance of checkAssignment that expects multiple tasks, the test fails on the length of the task list. The assignMultiple tests are broken for the reason I describe above. The other tests that fail are testDelaySchedulingAtNodeLevel, testDelaySchedulingAtRackLevel, and testDelaySchedulingOffRack. In all three of those cases a task list of length 2 of the same task type is expected and a task list of length 1 is returned. Again, with a default capacity of 2 slots per task type, we'd be over capacity and actually hitting the condition this patch fixes.

        Relevant output from failure:

        Testcase: testSmallJobsWithAssignMultiple took 0.58 sec
        FAILED
        expected:<4> but was:<2>
        junit.framework.AssertionFailedError: expected:<4> but was:<2>
        at org.apache.hadoop.mapred.TestFairScheduler.checkAssignment(TestFairScheduler.java:2810)
        at org.apache.hadoop.mapred.TestFairScheduler.testSmallJobsWithAssignMultiple(TestFairScheduler.java:784)

        Testcase: testLargeJobs took 0.432 sec
        Testcase: testLargeJobsWithAssignMultiple took 0.435 sec
        FAILED
        expected:<4> but was:<2>
        junit.framework.AssertionFailedError: expected:<4> but was:<2>
        at org.apache.hadoop.mapred.TestFairScheduler.checkAssignment(TestFairScheduler.java:2810)
        at org.apache.hadoop.mapred.TestFairScheduler.testLargeJobsWithAssignMultiple(TestFairScheduler.java:954)

        Testcase: testJobsWithPriorities took 0.432 sec
        Testcase: testLargeJobsWithPools took 0.672 sec
        Testcase: testLargeJobsWithExcessCapacity took 0.598 sec
        Testcase: testLargeJobsWithExcessCapacityAndAssignMultiple took 0.594 sec
        FAILED
        expected:<4> but was:<2>
        junit.framework.AssertionFailedError: expected:<4> but was:<2>
        at org.apache.hadoop.mapred.TestFairScheduler.checkAssignment(TestFairScheduler.java:2810)
        at org.apache.hadoop.mapred.TestFairScheduler.testLargeJobsWithExcessCapacityAndAssignMultiple(TestFairScheduler.java:1301)

        Testcase: testSmallJobInLargePool took 0.517 sec
        Testcase: testPoolMaxJobs took 0.842 sec
        Testcase: testUserMaxJobs took 0.73 sec
        Testcase: testComplexJobLimits took 2.053 sec
        Testcase: testSizeBasedWeight took 0.387 sec
        Testcase: testPoolWeights took 0.851 sec
        Testcase: testPoolWeightsWhenNoMaps took 0.637 sec
        Testcase: testPoolMaxMapsReduces took 0.404 sec
        Testcase: testCapBasedLoadManager took 0.119 sec
        Testcase: testMinSharePreemption took 0.487 sec
        Testcase: testMinSharePreemptionWithSmallJob took 0.426 sec
        Testcase: testFairSharePreemption took 0.612 sec
        Testcase: testFairSharePreemptionFromMultiplePools took 0.602 sec
        Testcase: testMinAndFairSharePreemption took 0.42 sec
        Testcase: testNoPreemptionIfDisabled took 0.457 sec
        Testcase: testNoPreemptionIfOnlyLogging took 0.458 sec
        Testcase: testDelaySchedulingAtNodeLevel took 0.366 sec
        FAILED
        expected:<2> but was:<1>
        junit.framework.AssertionFailedError: expected:<2> but was:<1>
        at org.apache.hadoop.mapred.TestFairScheduler.checkAssignment(TestFairScheduler.java:2810)
        at org.apache.hadoop.mapred.TestFairScheduler.testDelaySchedulingAtNodeLevel(TestFairScheduler.java:2301)

        Testcase: testDelaySchedulingAtRackLevel took 0.521 sec
        FAILED
        expected:<2> but was:<1>
        junit.framework.AssertionFailedError: expected:<2> but was:<1>
        at org.apache.hadoop.mapred.TestFairScheduler.checkAssignment(TestFairScheduler.java:2810)
        at org.apache.hadoop.mapred.TestFairScheduler.testDelaySchedulingAtRackLevel(TestFairScheduler.java:2349)

        Testcase: testDelaySchedulingOffRack took 0.261 sec
        FAILED
        expected:<2> but was:<1>
        junit.framework.AssertionFailedError: expected:<2> but was:<1>

        This happens every time checkAssignment() gets called with more than one task as a parameter. This is a buggy test.

        Jeff Bean added a comment -

        (strike the last incomplete sentence of the last comment.)

        Jeff Bean added a comment -

        Trying to think about how to fix the test, though. Can we increase the slot count per tracker in the test cluster so as not to hit the clumping? Or maybe just comment out those particular tests?

        Todd Lipcon added a comment -

        Hey Jeff - hope you didn't spend much time on it. After my earlier comment I looked into it and came to similar conclusions... I'll post a new patch here soon.

        Jeff Bean added a comment -

        Just a bunch of Tuesday. No big deal.

        Todd Lipcon added a comment -

        So, I disagree with some of what you said above. The test was broken, but not quite in the way you described.

        The issue is that the test was asserting correct behavior, but the mocking didn't accurately reflect the way the true scheduler interacts with JobInProgress, etc. In the mocks, as soon as "obtainNewMapTask" was called, the new task was inserted into the TaskTrackerStatus's taskReports structure, so that the "countMapTasks" and "countReduceTasks" functions included the just-scheduled tasks. So, the old code in LoadManager actually did the right thing as far as the test/mock setup was concerned.

        Once we fixed the LoadManager to work correctly with the real code (which doesn't insert anything into TaskTrackerStatus when the tasks are allocated), it ended up basically double-counting each assigned task when running against the mocks. So, only half as many tasks were scheduled as were supposed to be.

        The fix was to change the mock to obtain all of the scheduled tasks, and only then add them to the task report structure.

        I also had to change the code in the assignment loop to add mapsAssigned and reducesAssigned around line 476 of FairScheduler.java. Otherwise the "flip flopping" back and forth between map and reduce task assignment broke.
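        Roughly, the fixed loop looks like this (a skeleton with stand-in names; tasks, tts, the slot counts, and nextMapTask()/nextReduceTask() stand in for the surrounding assignTasks code and job lookups):

          int mapsAssigned = 0;
          int reducesAssigned = 0;
          boolean progress = true;
          while (progress) {
            progress = false;
            if (loadMgr.canAssignMap(tts, runnableMaps, totalMapSlots, mapsAssigned)) {
              tasks.add(nextMapTask());      // stand-in for the real job lookup
              mapsAssigned++;
              progress = true;
            }
            if (loadMgr.canAssignReduce(tts, runnableReduces, totalReduceSlots, reducesAssigned)) {
              tasks.add(nextReduceTask());   // stand-in for the real job lookup
              reducesAssigned++;
              progress = true;
            }
          }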

        Todd Lipcon added a comment -

        Updated patch against 0.20-security. Matei: if you wouldn't mind taking a quick look at the new rev, I'd appreciate it.

        This version passes all the FairScheduler tests.

        Matei Zaharia added a comment -

        The patch looks good, except that there's an extra space on a line in TestFairScheduler:

             List<Task> tasks = scheduler.assignTasks(tracker(taskTrackerName))
        

        Other than that, I think it's good to go.

        Todd Lipcon added a comment -

        Thanks Matei. Will fix that on commit.

        Todd Lipcon added a comment -

        Committed to 0.20-security.

        Matt Foley added a comment -

        Closed upon release of Hadoop-1.1.0.


          People

          • Assignee: Jeff Bean
          • Reporter: Jeff Bean
          • Votes: 0
          • Watchers: 7