Hadoop Map/Reduce / MAPREDUCE-270

TaskTracker could send an out-of-band heartbeat when the last running map/reduce completes

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Introduced an option to allow tasktrackers to send an out of band heartbeat on task-completion to improve job latency. A new configuration option mapreduce.tasktracker.outofband.heartbeat is defined, which can be enabled to send this heartbeat.

      Description

      Currently the TaskTracker strictly respects the heartbeat interval, which causes utilization issues when all running tasks complete: the freed slots sit idle until the next scheduled heartbeat. We could send an out-of-band heartbeat in that case.
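
The idea can be illustrated with a minimal sketch. This is not the TaskTracker code from the patch: the class name, method names, and the use of a lock/condition pair are all assumptions made for illustration, and the `oobEnabled` flag merely stands in for the `mapreduce.tasktracker.outofband.heartbeat` option. The heartbeat loop normally sleeps for the full interval, but is woken early when the last running task completes.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of an early-wake heartbeat wait; not Hadoop's implementation. */
final class OobHeartbeatSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition wake = lock.newCondition();
    private final boolean oobEnabled;   // stand-in for mapreduce.tasktracker.outofband.heartbeat
    private int runningTasks;
    private boolean wakeRequested;

    OobHeartbeatSketch(boolean oobEnabled, int runningTasks) {
        this.oobEnabled = oobEnabled;
        this.runningTasks = runningTasks;
    }

    /** Called when a task finishes; requests an early heartbeat if it was the last one running. */
    void taskCompleted() {
        lock.lock();
        try {
            runningTasks--;
            if (oobEnabled && runningTasks == 0) {
                wakeRequested = true;
                wake.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until the regular interval elapses or an early wake is requested.
     * Returns true if the caller should send an out-of-band heartbeat now.
     */
    boolean awaitNextHeartbeat(long intervalMillis) {
        lock.lock();
        try {
            long remaining = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
            // Loop guards against spurious wakeups.
            while (!wakeRequested && remaining > 0) {
                remaining = wake.awaitNanos(remaining);
            }
            boolean early = wakeRequested;
            wakeRequested = false;
            return early;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

With the flag off, `awaitNextHeartbeat` always waits out the full interval, which matches the behaviour described above as the current one.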

      1. MAPREDUCE-270.patch
        3 kB
        Arun C Murthy
      2. MAPREDUCE-270.patch
        4 kB
        Arun C Murthy
      3. MAPREDUCE-270.patch
        10 kB
        Arun C Murthy
      4. MAPREDUCE-270.patch
        12 kB
        Arun C Murthy
      5. MAPREDUCE-270.patch
        13 kB
        Arun C Murthy
      6. MAPREDUCE-270_yhadoop20.patch
        2 kB
        Arun C Murthy
      7. MAPREDUCE-270_yhadoop20.patch
        4 kB
        Arun C Murthy
      8. MAPREDUCE-270_yhadoop20.patch
        9 kB
        Arun C Murthy
      9. MAPREDUCE-270_yhadoop20.patch
        11 kB
        Arun C Murthy
      10. MAPREDUCE-270_yhadoop20.patch
        12 kB
        Arun C Murthy

        Activity

        Arun C Murthy created issue -
        Owen O'Malley made changes -
        Field Original Value New Value
        Project Hadoop Common [ 12310240 ] Hadoop Map/Reduce [ 12310941 ]
        Key HADOOP-5129 MAPREDUCE-270
        Affects Version/s 0.20.0 [ 12313438 ]
        Component/s mapred [ 12310690 ]
        Arun C Murthy added a comment -

        We've seen the lack of out-of-band heartbeat cause severe latency issues for some class of small applications. I'd like to use this jira to re-introduce out-of-band heartbeats when configured for completion of any task. Thoughts?

        Arun C Murthy added a comment -

        Straightforward patches to trunk and the Yahoo hadoop-0.20 distribution re-introducing the out-of-band heartbeat on task completion (configurable via a secret 'mapreduce.tasktracker.oob.heartbeat' knob).

        Arun C Murthy made changes -
        Attachment MAPREDUCE-270.patch [ 12420259 ]
        Attachment MAPREDUCE-270_yhadoop20.patch [ 12420260 ]
        Arun C Murthy added a comment -

        Re-worked the patch after talking to Devaraj.

        Arun C Murthy made changes -
        Attachment MAPREDUCE-270.patch [ 12420431 ]
        Attachment MAPREDUCE-270_yhadoop20.patch [ 12420432 ]
        Devaraj Das added a comment -

        +1 core changes look fine.

        Arun C Murthy made changes -
        Attachment MAPREDUCE-270_yhadoop20.patch [ 12420535 ]
        Arun C Murthy added a comment -

        Completed patch along with test cases.

        Arun C Murthy made changes -
        Attachment MAPREDUCE-270.patch [ 12420536 ]
        Arun C Murthy made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.21.0 [ 12314045 ]
        Fix Version/s 0.22.0 [ 12314184 ]
        Arun C Murthy added a comment -

        Nigel - This patch proved very hard to test without mock objects. For now, I've attached a slightly arbitrary test case which does the following:

        1. Simulates a very large cluster by setting a very high value of 30s for the heartbeat-interval between the JobTracker and TaskTracker.
        2. Switches on the out-of-band heartbeat for the cluster.
        3. Submits a very small random-writer job with 2 maps and asserts that the job completes within 120s.

        The 120s deadline is carefully chosen with the idea that a randomwriter job with 2 maps will need at least 4 heartbeats: setup-task, map_0, map_1 and cleanup-task. However this is still arbitrary and not very scientific. So, should we commit this test-case given that it is slightly flaky? Thoughts?

        PS: The job completes in ~50s with out-of-band heartbeats turned on, and in ~3 mins with them turned off. FYI
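
One reading of the arithmetic behind the 120s deadline, sketched under the assumption that each of the four tasks is handed out on a separate regular heartbeat: with a 30s interval, a job that relies only on scheduled heartbeats cannot finish sooner than 4 x 30s. The class and method names below are hypothetical and are not taken from the attached test case.

```java
/** Hypothetical helper illustrating the deadline arithmetic; not part of the patch. */
final class HeartbeatDeadline {
    /** Lower bound on job latency when every task assignment waits for a regular heartbeat. */
    static long minLatencySeconds(int heartbeatsNeeded, long heartbeatIntervalSeconds) {
        return heartbeatsNeeded * heartbeatIntervalSeconds;
    }

    public static void main(String[] args) {
        int heartbeats = 4;          // setup-task, map_0, map_1, cleanup-task
        long intervalSeconds = 30;   // heartbeat interval used by the test
        System.out.println(minLatencySeconds(heartbeats, intervalSeconds) + "s"); // prints 120s
    }
}
```

Under this reading, a run that finishes well inside the bound (such as the reported ~50s) suggests that at least some heartbeats were sent out of band, while the reported ~3 mins with the option off sits above it.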

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12420536/MAPREDUCE-270.patch
        against trunk revision 818674.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/59/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/59/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/59/console

        This message is automatically generated.

        Arun C Murthy added a comment -

        Cancelling patch to incorporate Nigel's offline review comments.

        Arun C Murthy made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Arun C Murthy made changes -
        Attachment MAPREDUCE-270_yhadoop20.patch [ 12420674 ]
        Arun C Murthy added a comment -

        Updated patch.

        Arun C Murthy made changes -
        Attachment MAPREDUCE-270.patch [ 12420675 ]
        Arun C Murthy made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12420675/MAPREDUCE-270.patch
        against trunk revision 818946.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/61/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/61/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/61/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/61/console

        This message is automatically generated.

        Arun C Murthy added a comment -

        The test failure (TestCopyFiles) is unrelated and is being tracked at MAPREDUCE-1029.

        Arun C Murthy added a comment -

        More fit/polish comments from Nigel/Owen.

        Arun C Murthy made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Arun C Murthy made changes -
        Attachment MAPREDUCE-270_yhadoop20.patch [ 12420718 ]
        Arun C Murthy made changes -
        Attachment MAPREDUCE-270.patch [ 12420719 ]
        Arun C Murthy made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Devaraj Das added a comment -

        +1

        Arun C Murthy added a comment -

        I just committed this.

        Arun C Murthy made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #69 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/69/).
        Fix the tasktracker to optionally send an out-of-band heartbeat on task-completion for better job-latency.
        Configuration changes:
        add mapreduce.tasktracker.outofband.heartbeat

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12420719/MAPREDUCE-270.patch
        against trunk revision 818946.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/136/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/136/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/136/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/136/console

        This message is automatically generated.

        Hemanth Yamijala made changes -
        Release Note Introduced an option to allow tasktrackers to send an out of band heartbeat on task-completion to improve job latency. A new configuration option mapreduce.tasktracker.outofband.heartbeat is defined, which can be enabled to send this heartbeat.
        Tom White made changes -
        Fix Version/s 0.22.0 [ 12314184 ]
        Fix Version/s 0.21.0 [ 12314045 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Arun C Murthy
            Reporter:
            Arun C Murthy
          • Votes:
            0
            Watchers:
            8