Hadoop Map/Reduce / MAPREDUCE-1306

[MUMAK] Randomize the arrival of heartbeat responses

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: contrib/mumak
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      [MUMAK] Randomize the arrival of heartbeat responses

      Description

      We propose the following changes to mumak (MAPREDUCE-728):

      • make the timing of heartbeat responses more realistic by adding an option to randomly perturb them
      • randomize the startup time of task trackers in a fixed interval
      • remove 2 magic constants from SimulatorEngine and make sure that the first job is submitted only after the entire cluster is up and running
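
As an illustration only (this is not code from the attached patches), the first two bullets could be sketched as below. The class name HeartbeatJitter, its parameters, and the choice of a uniform distribution are all assumptions made for this sketch; the actual patch may perturb heartbeats differently.

```java
import java.util.Random;

/** Hypothetical helper for this sketch; not a class from the Mumak patch. */
final class HeartbeatJitter {
  private final Random random;
  private final long baseIntervalMs;
  private final int maxPerturbationMs;

  HeartbeatJitter(long seed, long baseIntervalMs, int maxPerturbationMs) {
    // Keep the perturbed interval positive and the nextInt() bound within int range.
    if (maxPerturbationMs < 0
        || maxPerturbationMs >= baseIntervalMs
        || maxPerturbationMs > (Integer.MAX_VALUE - 1) / 2) {
      throw new IllegalArgumentException("invalid perturbation: " + maxPerturbationMs);
    }
    this.random = new Random(seed);
    this.baseIntervalMs = baseIntervalMs;
    this.maxPerturbationMs = maxPerturbationMs;
  }

  /** Next heartbeat interval, uniform in [base - max, base + max]. */
  long nextIntervalMs() {
    int offset = random.nextInt(2 * maxPerturbationMs + 1) - maxPerturbationMs;
    return baseIntervalMs + offset;
  }

  /** Startup delay for one task tracker, uniform in [0, windowMs). */
  int startupOffsetMs(int windowMs) {
    if (windowMs <= 0) {
      throw new IllegalArgumentException("window must be positive: " + windowMs);
    }
    return random.nextInt(windowMs);
  }
}
```

With a perturbation of 0 the helper degenerates to a fixed interval, which is one way such an option could be switched off.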

      Attachments

      1. MAPREDUCE-1306-20100308-hong.patch (40 kB, Hong Tang)
      2. MAPREDUCE-1306-20100308.patch (38 kB, Tamas Sarlos)
      3. MAPREDUCE-1306-20100108.patch (16 kB, Tamas Sarlos)

        Activity

        Tamas Sarlos added a comment -

        Attaching patch implementing the proposed improvements.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12429734/MAPREDUCE-1306-20100108.patch
        against trunk revision 897118.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/256/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/256/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/256/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/256/console

        This message is automatically generated.

        Hong Tang added a comment -

        Patch looks good. +1.

        Chris Douglas added a comment -

        It would help if the SimulatedTaskTracker RNG were seeded deterministically from a configurable source. If the STT cstr took a seed param, a single source of randomness could seed each STT and allow for an exact replay.

        Otherwise +1
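
A minimal sketch of the seeding idea in this comment, under stated assumptions: SeededTracker is a hypothetical stand-in for SimulatedTaskTracker, and SeedDemo is a made-up driver. Neither name nor signature comes from the patch. A single master Random hands a seed to each tracker's constructor, so the whole run is a function of one master seed.

```java
import java.util.Random;

/** Hypothetical stand-in for a simulated task tracker that takes a seed in its constructor. */
final class SeededTracker {
  private final Random random;

  SeededTracker(long seed) {
    this.random = new Random(seed);
  }

  /** Random heartbeat offset in [0, boundMs). */
  int nextHeartbeatOffsetMs(int boundMs) {
    return random.nextInt(boundMs);
  }
}

/** Hypothetical driver: one master seed deterministically seeds every tracker. */
final class SeedDemo {
  static int[] heartbeatOffsets(long masterSeed, int trackerCount, int boundMs) {
    Random master = new Random(masterSeed);
    int[] offsets = new int[trackerCount];
    for (int i = 0; i < trackerCount; i++) {
      // Each tracker gets its own seed, drawn in a fixed order from the master source.
      SeededTracker tracker = new SeededTracker(master.nextLong());
      offsets[i] = tracker.nextHeartbeatOffsetMs(boundMs);
    }
    return offsets;
  }
}
```

Calling heartbeatOffsets twice with the same master seed yields the same array, which is the property an exact replay relies on.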

        Tamas Sarlos added a comment -

        Re: Chris: random seed
        I agree; I just consciously followed the (wrong) convention that mumak has no random seed config option. This also needs to be fixed in SimulatorEngine.java and in org.apache.hadoop.tools.{ZombieJob, ZombieJobProducer}.java (see getNextJob()) to make the entire simulation replayable. I suggest changing all of these together in a separate patch. Do you agree?

        Hong Tang added a comment -

        +1 on Chris's suggestion. Re-playability is important.

        @tamas, the extra work in ZombieJob and ZombieJobProducer seems quite minor. I really hope we can just get it done here instead of opening another JIRA to address it.

        Tamas Sarlos added a comment -

        Attaching the updated patch that implements the random seeding option for mumak and rumen.

        In order to make the simulation deterministic, HashSets and HashMaps need to be replaced with different collection classes, since they make no guarantees about iteration order. For example, JobInProgress iterates over the JobTracker's nodesAtMaxLevel HashSet, and the order of this iteration influences the scheduling of non-local maps. Using AspectJ, all HashSets and HashMaps are replaced with LinkedHashSets and LinkedHashMaps, whose iteration order is determined by insertion order. This solution needs to be revisited if mumak ever becomes multi-threaded. An added functional test verifies deterministic replay by comparing the job history files.
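
The ordering property relied on here can be shown with a small standalone sketch. It does not reproduce the AspectJ substitution from the patch; IterationOrderDemo and the node names are made up for illustration. It only demonstrates that a LinkedHashSet iterates in insertion order, which HashSet does not promise.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical demo class; not part of Mumak or the patch. */
final class IterationOrderDemo {
  /** Returns the distinct nodes in the order they were first inserted. */
  static List<String> insertionOrder(String... nodes) {
    // LinkedHashSet documents insertion-order iteration; re-inserting an
    // existing element does not change its position.
    Set<String> set = new LinkedHashSet<>();
    for (String node : nodes) {
      set.add(node);
    }
    return new ArrayList<>(set);
  }
}
```

Because the order depends only on the sequence of insertions, two single-threaded runs that insert the same elements in the same order iterate identically.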

        Tamas Sarlos added a comment -

        Submitting the patch attached earlier; it passes all test-patch tests on my dev machine.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12438267/MAPREDUCE-1306-20100308.patch
        against trunk revision 920250.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 10 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/26/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/26/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/26/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/26/console

        This message is automatically generated.

        Tamas Sarlos added a comment -

        Canceling the patch so that I can resubmit it.

        Tamas Sarlos added a comment -

        Resubmitting the latest patch to Hudson, as the failing test runs fine locally.

        Hong Tang added a comment -

        I think we should avoid using conf objects to pass the seeds from SimulatorEngine (SE) to ZombieJobStoryProducer (ZJSP). This creates a hidden dependency from SE to ZJSP and is thus harder to maintain in the long run. I revised your patch so that SE passes a seed to SimulatorJobStoryProducer (SJSP), which in turn passes it down to ZJSP.
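
The design choice can be sketched roughly as follows. All three classes are hypothetical stand-ins (SketchEngine for SE, SketchOuterProducer for SJSP, SketchInnerProducer for ZJSP), and their methods are invented for this illustration. The point is only that the seed travels through constructor parameters, so the dependency is visible in each signature rather than hidden in a shared configuration object.

```java
import java.util.Random;

/** Hypothetical stand-in for ZJSP: owns the RNG, seeded by its caller. */
final class SketchInnerProducer {
  private final Random random;

  SketchInnerProducer(long seed) {
    this.random = new Random(seed);
  }

  /** Made-up example of a randomized quantity, in [0, 10000) ms. */
  int nextJobDelayMs() {
    return random.nextInt(10_000);
  }
}

/** Hypothetical stand-in for SJSP: receives the seed and passes it down. */
final class SketchOuterProducer {
  private final SketchInnerProducer inner;

  SketchOuterProducer(long seed) {
    this.inner = new SketchInnerProducer(seed);
  }

  int nextJobDelayMs() {
    return inner.nextJobDelayMs();
  }
}

/** Hypothetical stand-in for SE: the only place that chooses the seed. */
final class SketchEngine {
  private final SketchOuterProducer producer;

  SketchEngine(long seed) {
    this.producer = new SketchOuterProducer(seed);
  }

  int[] firstDelays(int count) {
    int[] delays = new int[count];
    for (int i = 0; i < count; i++) {
      delays[i] = producer.nextJobDelayMs();
    }
    return delays;
  }
}
```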

        Hong Tang added a comment -

        Forgot to say that the patch is otherwise +1 from me.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12438267/MAPREDUCE-1306-20100308.patch
        against trunk revision 921230.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 10 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/515/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/515/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/515/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/515/console

        This message is automatically generated.

        Tamas Sarlos added a comment -

        +1 for Hong's change; it is clearly better that way.

        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Tamas!

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #272 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/272/).
        Randomize the arrival of heartbeat responses in Mumak.
        Contributed by Tamas Sarlos.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #255 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/255/).
        Randomize the arrival of heartbeat responses in Mumak.
        Contributed by Tamas Sarlos.


          People

          • Assignee: Tamas Sarlos
          • Reporter: Tamas Sarlos
          • Votes: 0
          • Watchers: 3
