Hadoop Map/Reduce
  MAPREDUCE-1219

JobTracker Metrics causes undue load on JobTracker

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      JobTrackerMetricsInst.doUpdates pushes the job-level counters of all running jobs into the JobTracker's metrics, which causes very bad performance and hampers heartbeats. Since job-level metrics are better served by JobHistory, it may be a good idea to remove them from the metrics framework.

      1. MR-1219-2.patch
        4 kB
        Sreekanth Ramakrishnan
      2. MR-1219-1.patch
        2 kB
        Sreekanth Ramakrishnan
      3. MAPREDUCE-1219.patch
        0.6 kB
        Arun C Murthy
      4. patch-1219-ydist.txt
        0.6 kB
        Amareshwari Sriramadasu

        Activity

        Jothi Padmanabhan created issue -
        Amareshwari Sriramadasu added a comment -

        Patch for Yahoo! distribution, removing the job updates from JobTrackerMetricsInst.doUpdates()

        Amareshwari Sriramadasu made changes -
        Field Original Value New Value
        Attachment patch-1219-ydist.txt [ 12425302 ]
        Arun C Murthy added a comment -

        Patch for trunk.

        Arun C Murthy made changes -
        Attachment MAPREDUCE-1219.patch [ 12425365 ]
        V.Karthikeyan added a comment -

        Verified the job metrics with the FileContext property enabled.
        Ran jobs to verify that the counters are in sync between the
        JobTracker UI and the log file generated with FileContext enabled.
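
For context, a file-based metrics context of the kind mentioned above is usually switched on in conf/hadoop-metrics.properties. The fragment below is only a sketch: the context name `mapred`, the period, and the output path are illustrative assumptions, not the settings used for this verification.

```properties
# conf/hadoop-metrics.properties (sketch; values are illustrative assumptions)
# Write the metrics of the "mapred" context to a local file every 10 seconds.
mapred.class=org.apache.hadoop.metrics.file.FileContext
mapred.period=10
mapred.fileName=/tmp/mrmetrics.log
```

With a setup like this, the values written to the file can be compared against what the JobTracker UI reports.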

        Amareshwari Sriramadasu added a comment -

        test-patch result for Y!20 patch :

             [exec] -1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
             [exec]                         Please justify why no tests are needed for this patch.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
        

        -1 tests included.

        The patch removes some existing code for which there are no unit tests.

        Amareshwari Sriramadasu added a comment -

        All tests passed on my machine except TestHdfsProxy.

        Sreekanth Ramakrishnan added a comment -

        In its current state, the JobTracker publishes its own metrics along with the metrics of its running jobs. The list of running jobs can be quite long, and the metrics update cycle runs on every heartbeat, which significantly increases heartbeat processing time. Also, the job-level metrics are nothing more than the counters of the running job. Those counters are obtained by locking the job, which does not help performance either. Looking at the information published, shouldn't the JobTracker publish only its own metrics and not include job-level details? Users can also obtain the counters through a different API. So can we remove the job-level metrics (that is, the counters) from the JobTracker metrics? Thoughts?
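
The cost described above can be sketched roughly as follows. This is a simplified illustration, not the Hadoop source: the class `JobTrackerMetricsSketch`, the nested `Job` type, and both `doUpdates...` methods are hypothetical stand-ins, and the counter and metric names are made up. It only shows why a per-job walk that takes each job's lock scales with the number of running jobs, while a tracker-only update does not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-ins for illustration; these are not the actual Hadoop classes. */
class JobTrackerMetricsSketch {

    /** Stand-in for a running job whose counters are guarded by the job's own lock. */
    static final class Job {
        private final String id;
        private final Map<String, Long> counters = new HashMap<>();
        private int snapshotCount = 0;

        Job(String id) {
            this.id = id;
        }

        synchronized void increment(String name, long delta) {
            counters.merge(name, delta, Long::sum);
        }

        /** Copying the counters holds the job lock, so other work on the job has to wait. */
        synchronized Map<String, Long> snapshotCounters() {
            snapshotCount++;
            return new HashMap<>(counters);
        }

        synchronized int snapshotCount() {
            return snapshotCount;
        }
    }

    /** Before: each update cycle walks every running job and copies its counters. */
    static Map<String, Long> doUpdatesWithJobCounters(List<Job> runningJobs,
                                                      Map<String, Long> trackerMetrics) {
        Map<String, Long> published = new HashMap<>(trackerMetrics);
        for (Job job : runningJobs) {
            for (Map.Entry<String, Long> e : job.snapshotCounters().entrySet()) {
                published.put(job.id + "." + e.getKey(), e.getValue());
            }
        }
        return published;
    }

    /** After: only tracker-level metrics are published; no job lock is taken. */
    static Map<String, Long> doUpdatesTrackerOnly(Map<String, Long> trackerMetrics) {
        return new HashMap<>(trackerMetrics);
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            Job job = new Job("job_" + i);
            job.increment("MAP_INPUT_RECORDS", 100L * (i + 1));
            jobs.add(job);
        }
        Map<String, Long> tracker = new HashMap<>();
        tracker.put("jobs_submitted", 3L);

        // One tracker metric plus one counter per job.
        System.out.println(doUpdatesWithJobCounters(jobs, tracker).size()); // prints 4
        // Tracker metric only, regardless of how many jobs are running.
        System.out.println(doUpdatesTrackerOnly(tracker).size()); // prints 1
    }
}
```

In the first variant the work per update cycle, and the number of job locks taken, grows with the number of running jobs; in the second it stays constant, which is the effect the proposed removal is after.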

        Amareshwari Sriramadasu added a comment -

        shouldn't jobtracker publish its own metrics and not include job level details? Also, users can obtain the counters using different API. So can we remove the job level metrics aka counters from JobTracker metrics?

        +1 we should remove the Job level metrics from JobTracker metrics.

        The attached patch just removes jobMetrics from doUpdates(). Shall we also remove the collection part from the JobInProgress code, since the metrics are no longer updated?

        Sreekanth Ramakrishnan added a comment -

        Attaching patch removing the unused code in JobInProgress

        Sreekanth Ramakrishnan made changes -
        Attachment MR-1219-1.patch [ 12442501 ]
        Sreekanth Ramakrishnan added a comment -

        Running through Hudson.

        Sreekanth Ramakrishnan made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12442501/MR-1219-1.patch
        against trunk revision 936166.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/125/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/125/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/125/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/125/console

        This message is automatically generated.

        Sreekanth Ramakrishnan added a comment -

        The justification is mentioned in the comment:
        https://issues.apache.org/jira/browse/MAPREDUCE-1219?focusedCommentId=12779940&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12779940

        I have just removed unused code from JobInProgress.

        Amareshwari Sriramadasu added a comment -

        Shall we remove the member variable jobMetrics itself?

        Amareshwari Sriramadasu made changes -
        Assignee Amareshwari Sriramadasu [ amareshwari ] Sreekanth Ramakrishnan [ sreekanth ]
        Sreekanth Ramakrishnan added a comment -

        Incorporating Amareshwari's comment.

        Sreekanth Ramakrishnan made changes -
        Attachment MR-1219-2.patch [ 12442659 ]
        Sreekanth Ramakrishnan made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Sreekanth Ramakrishnan made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Amareshwari Sriramadasu added a comment -

        +1 patch looks good.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12442659/MR-1219-2.patch
        against trunk revision 937201.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/131/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/131/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/131/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/131/console

        This message is automatically generated.

        Sharad Agarwal added a comment -

        I committed this. Thanks Sreekanth.

        Sharad Agarwal made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.22.0 [ 12314184 ]
        Resolution Fixed [ 1 ]
        Tom White made changes -
        Fix Version/s 0.21.0 [ 12314045 ]
        Fix Version/s 0.22.0 [ 12314184 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                    Time In Source Status  Execution Times  Last Executer           Last Execution Date
        Patch Available -> Open       1d 6h 16m              1                Sreekanth Ramakrishnan  23/Apr/10 09:33
        Open -> Patch Available       154d 21h 43m           2                Sreekanth Ramakrishnan  23/Apr/10 09:34
        Patch Available -> Resolved   2d 19h 42m             1                Sharad Agarwal          26/Apr/10 05:16
        Resolved -> Closed            120d 17h 2m            1                Tom White               24/Aug/10 22:19

          People

          • Assignee:
            Sreekanth Ramakrishnan
            Reporter:
            Jothi Padmanabhan
          • Votes:
            0
            Watchers:
            4

            Dates

            • Created:
              Updated:
              Resolved:
