Hadoop Common / HADOOP-5784

The length of the heartbeat cycle should be configurable.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Introduced a configuration parameter, mapred.heartbeats.in.second, as an expert option, that defines how many heartbeats a jobtracker can process in a second. Administrators can set this to an appropriate value based on cluster size and expected processing time on the jobtracker to achieve a balance between jobtracker scalability and latency of jobs.

      Description

      Currently, the heartbeat cycle is set to (number of nodes / 100) seconds. This can be too long for clusters that need to run low-latency jobs. We should make the number of heartbeats that should arrive per second configurable.

      1. patch-5784.txt
        3 kB
        Amareshwari Sriramadasu
      2. patch-5784-1.txt
        6 kB
        Amareshwari Sriramadasu
      3. HADOOP-5784_yhadoop20.patch
        6 kB
        Arun C Murthy

        Activity

        Tom White made changes -
        Status Resolved [ 5 ] -> Closed [ 6 ]
        Hemanth Yamijala made changes -
        Release Note Introduced a configuration parameter, mapred.heartbeats.in.second, as an expert option, that defines how many heartbeats a jobtracker can process in a second. Administrators can set this to an appropriate value based on cluster size and expected processing time on the jobtracker to achieve a balance between jobtracker scalability and latency of jobs.
        Devaraj Das added a comment -

        +1 on the 0.20 patch

        Arun C Murthy made changes -
        Attachment HADOOP-5784_yhadoop20.patch [ 12420257 ]
        Arun C Murthy added a comment -

        Patch for the yahoo hadoop-0.20 distribution.

        Owen O'Malley made changes -
        Component/s mapred [ 12310690 ]
        Devaraj Das made changes -
        Status Patch Available [ 10002 ] -> Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Devaraj Das added a comment -

        I just committed this. Thanks, Amareshwari!

        Amareshwari Sriramadasu added a comment -

        Test failures are not related to the patch. All tests passed on my machine.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12409148/patch-5784-1.txt
        against trunk revision 779944.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/console

        This message is automatically generated.

        Amareshwari Sriramadasu made changes -
        Status Open [ 1 ] -> Patch Available [ 10002 ]
        Amareshwari Sriramadasu added a comment -

        test-patch result:

             [exec]
             [exec] +1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec]
        

        ant test passed on my machine

        Amareshwari Sriramadasu made changes -
        Attachment patch-5784-1.txt [ 12409148 ]
        Amareshwari Sriramadasu added a comment -

        Patch updated with a test case.

        Amareshwari Sriramadasu made changes -
        Status Patch Available [ 10002 ] -> Open [ 1 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12408937/patch-5784.txt
        against trunk revision 778994.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        -1 release audit. The applied patch generated 493 release audit warnings (more than the trunk's current 492 warnings).

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/testReport/
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/current/releaseAuditDiffWarnings.txt
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/console

        This message is automatically generated.

        Owen O'Malley added a comment -

        This looks good, but I wish there was a good way to set up a test case. I guess the best way would be to create a JobTracker and call the heartbeat method and observe the requested heartbeat interval.

        Amar Kamat added a comment -

        I wonder which is more intuitive: number-of-heartbeats-per-sec or heartbeat-interval. The title says heartbeat-interval should be configurable, whereas the description states that number-of-heartbeats-per-sec should be configurable. I personally think heartbeat-interval is easier to set and play around with. Thoughts?

        Regarding the test case, can't we spoof the tasktracker status and invoke JobTracker.heartbeat()? This way we can increment the tracker count and query the jobtracker for the current heartbeat interval. Thoughts?
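
        The test idea in this comment can be pictured with a small stand-in. This is only a sketch: FakeJobTracker, its heartbeat(String) method, and the helper intervalAfter are hypothetical names invented for illustration and are not Hadoop's JobTracker API. The interval arithmetic (trackers divided by heartbeats per second, rounded up, with a 3-second floor) is an assumption based on the formula discussed elsewhere in this issue.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the JobTracker; NOT Hadoop's real class or API.
final class FakeJobTracker {
    private static final int HEARTBEAT_INTERVAL_MIN = 3; // seconds (assumed floor)
    private final int heartbeatsInSecond;
    private final Set<String> trackers = new HashSet<>();

    FakeJobTracker(int heartbeatsInSecond) {
        this.heartbeatsInSecond = heartbeatsInSecond;
    }

    /** Records the reporting tracker and returns the interval (seconds) it is asked to use. */
    int heartbeat(String trackerName) {
        trackers.add(trackerName); // a repeated name does not grow the cluster
        int spread = (trackers.size() + heartbeatsInSecond - 1) / heartbeatsInSecond;
        return Math.max(spread, HEARTBEAT_INTERVAL_MIN);
    }
}

final class HeartbeatTestSketch {
    /** Sends one spoofed heartbeat from each of n trackers; returns the last interval seen. */
    static int intervalAfter(FakeJobTracker jt, int n) {
        int interval = 0;
        for (int i = 0; i < n; i++) {
            interval = jt.heartbeat("tracker_" + i);
        }
        return interval;
    }

    public static void main(String[] args) {
        // With 2 heartbeats per second, 4 trackers stay at the floor and 10 need 5 seconds.
        System.out.println(intervalAfter(new FakeJobTracker(2), 4));
        System.out.println(intervalAfter(new FakeJobTracker(2), 10));
    }
}
```

        A real test would build TaskTrackerStatus objects and read the interval from the heartbeat response instead of using these stand-ins.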

        Amareshwari Sriramadasu made changes -
        Status Open [ 1 ] -> Patch Available [ 10002 ]
        Fix Version/s 0.21.0 [ 12313563 ]
        Amareshwari Sriramadasu added a comment -

        test-patch result:

             [exec] -1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
             [exec]                         Please justify why no tests are needed for this patch.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        It is difficult to write a unit test for this.
        Tested the patch by running sort on 500 nodes with mapred.heartbeats.in.second=200.

        Amareshwari Sriramadasu made changes -
        Attachment patch-5784.txt [ 12408937 ]
        Amareshwari Sriramadasu added a comment -

        Patch making the number of heartbeats that arrive at the JobTracker per second configurable.

        Amareshwari Sriramadasu made changes -
        Field Original Value New Value
        Assignee Amareshwari Sriramadasu [ amareshwari ]
        Amareshwari Sriramadasu added a comment -

        The current heartbeat interval is set to clusterSize / 100 seconds, with a minimum interval of 3 seconds.
        This assumes that the JT can process 100 heartbeats in a second. See http://issues.apache.org/jira/browse/HADOOP-1900?focusedCommentId=12542530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12542530

        Now, if we make the number of heartbeats that should arrive in a second configurable (with a default value of 100), the heartbeat interval can be calculated as

         heartbeatInterval = max((clusterSize / #heartbeats in a second), HEARTBEAT_INTERVAL_MIN);
        

        Thoughts?
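
        The proposed calculation can be sketched in Java as follows. This is an illustration and not the code in the attached patches: the class and method names, the use of whole seconds, and rounding up are all assumptions. Only the formula, the default of 100, and the 3-second minimum come from this comment.

```java
// Illustrative sketch only; names, units, and rounding are assumptions, not the patch.
final class HeartbeatIntervalSketch {
    // Minimum heartbeat interval in seconds, as stated in the comment.
    static final int HEARTBEAT_INTERVAL_MIN = 3;
    // Default number of heartbeats the JobTracker is assumed to process per second.
    static final int DEFAULT_HEARTBEATS_IN_SECOND = 100;

    /** Returns the heartbeat interval in seconds for the given cluster size. */
    static int heartbeatInterval(int clusterSize, int heartbeatsInSecond) {
        if (heartbeatsInSecond <= 0) {
            throw new IllegalArgumentException("heartbeatsInSecond must be positive");
        }
        // Round up so the arrival rate does not exceed the configured value.
        int spread = (clusterSize + heartbeatsInSecond - 1) / heartbeatsInSecond;
        return Math.max(spread, HEARTBEAT_INTERVAL_MIN);
    }

    public static void main(String[] args) {
        System.out.println(heartbeatInterval(2000, DEFAULT_HEARTBEATS_IN_SECOND)); // 20
        System.out.println(heartbeatInterval(2000, 200)); // 10
        System.out.println(heartbeatInterval(200, DEFAULT_HEARTBEATS_IN_SECOND));  // 3, the minimum applies
    }
}
```

        With these assumptions, raising the configured rate from 100 to 200 halves the interval on a 2000-node cluster, while small clusters stay at the 3-second minimum regardless of the setting.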

        Steve Loughran added a comment -

        Is the goal here to detect failures of data nodes, or to keep an up-to-date record of which task trackers have capacity?

        Owen O'Malley created issue -

          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Owen O'Malley
          • Votes:
            1
            Watchers:
            5
