Hadoop Common / HADOOP-5784

The length of the heartbeat cycle should be configurable.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Introduced a configuration parameter, mapred.heartbeats.in.second, as an expert option, that defines how many heartbeats a jobtracker can process in a second. Administrators can set this to an appropriate value based on cluster size and expected processing time on the jobtracker to achieve a balance between jobtracker scalability and latency of jobs.
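As a rough illustration of how such a parameter might be read, the sketch below looks up mapred.heartbeats.in.second and falls back to 100, the default proposed in the discussion on this issue. It uses java.util.Properties as a stand-in for Hadoop's configuration object; the class name, method name, and the positive-value check are assumptions made for illustration, not code from the patch.

```java
import java.util.Properties;

// Hypothetical helper: java.util.Properties stands in for Hadoop's configuration object.
final class HeartbeatConfigSketch {
  static final String HEARTBEATS_IN_SECOND_KEY = "mapred.heartbeats.in.second";
  // Default taken from the discussion on this issue (100 heartbeats per second).
  static final int DEFAULT_HEARTBEATS_IN_SECOND = 100;

  /** Returns the configured heartbeats-per-second value, or the default if unset. */
  static int heartbeatsInSecond(Properties conf) {
    String raw = conf.getProperty(HEARTBEATS_IN_SECOND_KEY);
    if (raw == null) {
      return DEFAULT_HEARTBEATS_IN_SECOND;
    }
    int value = Integer.parseInt(raw.trim());
    // Assumption: a zero or negative rate is rejected rather than silently accepted.
    if (value <= 0) {
      throw new IllegalArgumentException(
          HEARTBEATS_IN_SECOND_KEY + " must be positive, got " + value);
    }
    return value;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(heartbeatsInSecond(conf));
    conf.setProperty(HEARTBEATS_IN_SECOND_KEY, "200");
    System.out.println(heartbeatsInSecond(conf));
  }
}
```

An unset property yields 100, and setting the property to 200 yields 200.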

      Description

      Currently, the heartbeat cycle is set to (# nodes / 100) seconds. This can be too long for clusters that need to run low-latency jobs. We should make the number of heartbeats that should arrive per second configurable.

      1. patch-5784-1.txt
        6 kB
        Amareshwari Sriramadasu
      2. patch-5784.txt
        3 kB
        Amareshwari Sriramadasu
      3. HADOOP-5784_yhadoop20.patch
        6 kB
        Arun C Murthy

        Activity

        steve_l added a comment -

        Is the goal here to detect failures of data nodes, or to keep an up-to-date view of which task trackers have capacity?

        Amareshwari Sriramadasu added a comment -

        The current heartbeat interval is set to clusterSize / 100 seconds, with a minimum of 3 seconds.
        This assumes that the JT can process 100 heartbeats per second. See http://issues.apache.org/jira/browse/HADOOP-1900?focusedCommentId=12542530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12542530

        Now, if we make the number of heartbeats that should arrive per second configurable (with a default value of 100), the heartbeat interval can be calculated as

         heartbeatInterval = max((clusterSize / #heartbeats in a second),  HEARTBEAT_INTERVAL_MIN) ;
        

        Thoughts?
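The proposed formula can be sketched as follows. This is a minimal illustration, not the patch itself: the class and method names are hypothetical, the interval is kept in whole seconds, and rounding the division up is an assumption, since the formula does not say how fractions are handled.

```java
// Hypothetical sketch of the proposed interval calculation, in whole seconds.
final class HeartbeatIntervalSketch {
  /** Lower bound on the interval, in seconds (the 3-second minimum from the comment). */
  static final int HEARTBEAT_INTERVAL_MIN = 3;

  /**
   * heartbeatInterval = max(clusterSize / heartbeatsInSecond, HEARTBEAT_INTERVAL_MIN).
   * The division is rounded up (assumption) so the arrival rate never exceeds the target.
   */
  static int heartbeatIntervalSeconds(int clusterSize, int heartbeatsInSecond) {
    if (clusterSize < 0 || heartbeatsInSecond <= 0) {
      throw new IllegalArgumentException("clusterSize must be >= 0 and heartbeatsInSecond > 0");
    }
    // Ceiling division using integer arithmetic.
    int scaled = (clusterSize + heartbeatsInSecond - 1) / heartbeatsInSecond;
    return Math.max(scaled, HEARTBEAT_INTERVAL_MIN);
  }

  public static void main(String[] args) {
    System.out.println(heartbeatIntervalSeconds(500, 100));
    System.out.println(heartbeatIntervalSeconds(500, 200));
  }
}
```

With 500 nodes, a rate of 100 gives a 5-second interval, while raising the rate to 200 brings it down to the 3-second minimum.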

        Amareshwari Sriramadasu added a comment -

        Patch making the number of heartbeats that arrive at the JobTracker per second configurable.

        Amareshwari Sriramadasu added a comment -

        test-patch result:

             [exec] -1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
             [exec]                         Please justify why no tests are needed for this patch.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        It is difficult to write a unit test for this.
        Tested the patch by running sort on 500 nodes with mapred.heartbeats.in.second=200.

        Amar Kamat added a comment -

        I wonder which is more intuitive: number-of-heartbeats-per-sec or heartbeat-interval. The title says heartbeat-interval should be configurable, whereas the description says number-of-heartbeats-per-sec should be configurable. I personally think heartbeat-interval is easier to set and play around with. Thoughts?

        Regarding the test case, can't we spoof the tasktracker status and invoke JobTracker.heartbeat()? This way we can increment the tracker count and query the jobtracker for the current heartbeat interval. Thoughts?
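The test idea could look roughly like this. FakeJobTracker is a hypothetical stand-in written for this sketch, not Hadoop's JobTracker: it only counts distinct tracker names and derives the interval from the formula proposed earlier in this thread, rounded up (an assumption). The point is only to show the shape of the check: send spoofed heartbeats, then query the interval.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a job tracker; it models only the tracker count and interval.
final class FakeJobTracker {
  private static final int HEARTBEAT_INTERVAL_MIN = 3; // seconds
  private final int heartbeatsInSecond;
  private final Set<String> trackers = new HashSet<>();

  FakeJobTracker(int heartbeatsInSecond) {
    if (heartbeatsInSecond <= 0) {
      throw new IllegalArgumentException("heartbeatsInSecond must be positive");
    }
    this.heartbeatsInSecond = heartbeatsInSecond;
  }

  /** Registers the (spoofed) tracker and returns the interval it is asked to wait, in seconds. */
  int heartbeat(String trackerName) {
    trackers.add(trackerName);
    return nextHeartbeatIntervalSeconds();
  }

  /** Interval derived from the current number of distinct trackers. */
  int nextHeartbeatIntervalSeconds() {
    int scaled = (trackers.size() + heartbeatsInSecond - 1) / heartbeatsInSecond;
    return Math.max(scaled, HEARTBEAT_INTERVAL_MIN);
  }

  public static void main(String[] args) {
    FakeJobTracker jt = new FakeJobTracker(100);
    int interval = 0;
    // Spoof heartbeats from 1000 distinct trackers.
    for (int i = 0; i < 1000; i++) {
      interval = jt.heartbeat("tracker_" + i);
    }
    System.out.println(interval);
  }
}
```

A real test would drive the actual JobTracker instead; here, 1000 spoofed trackers at a rate of 100 yield a requested interval of 10 seconds.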

        Owen O'Malley added a comment -

        This looks good, but I wish there was a good way to set up a test case. I guess the best way would be to create a JobTracker and call the heartbeat method and observe the requested heartbeat interval.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12408937/patch-5784.txt
        against trunk revision 778994.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        -1 release audit. The applied patch generated 493 release audit warnings (more than the trunk's current 492 warnings).

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/testReport/
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/current/releaseAuditDiffWarnings.txt
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/console

        This message is automatically generated.

        Amareshwari Sriramadasu added a comment -

        Patch updated with testcase.

        Amareshwari Sriramadasu added a comment -

        test-patch result:

             [exec]
             [exec] +1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec]
        

        ant test passed on my machine.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12409148/patch-5784-1.txt
        against trunk revision 779944.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/console

        This message is automatically generated.

        Amareshwari Sriramadasu added a comment -

        Test failures are not related to the patch. All tests passed on my machine.

        Devaraj Das added a comment -

        I just committed this. Thanks, Amareshwari!

        Arun C Murthy added a comment -

        Patch for the Yahoo hadoop-0.20 distribution.

        Devaraj Das added a comment -

        +1 on the 0.20 patch


          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Owen O'Malley
