Hadoop Map/Reduce / MAPREDUCE-1265

Include tasktracker name in the task attempt error log

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When a task attempt receives an error, TaskInProgress logs the task attempt id and the diagnostic string in the JobTracker log.
      For example:
      2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009xxxx_xxxx_r_000009_1: Error: java.lang.OutOfMemoryError: Java heap space
      2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009xxxx_xxxx_m_000478_0: Task attempt_2009xxxx_xxxx_m_000478_0 failed to report status for 601 seconds. Killing!

      When we want to debug a machine (for example, a node that has been blacklisted in the past few days), we have to use the task attempt id to find the tasktracker (TT). This is not very convenient.

      It would be nice to also log the tasktracker on which the error occurred.
      That way we can simply grep for the hostname to quickly find all the relevant error messages.

      1. MAPREDUCE-1265.patch (1 kB, Scott Chen)
      2. MAPREDUCE-1265-v2.patch (1 kB, Scott Chen)

        Activity

        Scott Chen added a comment -

        I just realized that the job id is part of the task attempt id, so we can easily obtain it.
        So we only need to log the tasktracker name here.
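
To illustrate why the job id does not need to be logged separately, here is a minimal sketch that derives it from a task attempt id by plain string splitting. It assumes the usual `attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number>` layout and a jobtracker identifier that contains no underscores. The class and method names (`AttemptIds`, `jobIdOf`) are hypothetical; in real Hadoop code, `TaskAttemptID.forName(...).getJobID()` would likely be the better choice.

```java
/** Hypothetical helper, not part of Hadoop: pulls the job id out of a task attempt id. */
final class AttemptIds {

    /**
     * Derives "job_<jt>_<n>" from "attempt_<jt>_<n>_<m|r>_<task>_<attempt>".
     * Assumes the jobtracker identifier itself contains no underscore.
     */
    static String jobIdOf(String attemptId) {
        String[] parts = attemptId.split("_");
        if (parts.length != 6 || !parts[0].equals("attempt")) {
            throw new IllegalArgumentException("Not a task attempt id: " + attemptId);
        }
        // parts[1] is the jobtracker identifier, parts[2] the job sequence number.
        return "job_" + parts[1] + "_" + parts[2];
    }

    public static void main(String[] args) {
        System.out.println(jobIdOf("attempt_2009xxxx_xxxx_r_000009_1"));
    }
}
```
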

        Here is the log after the change:
        2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009xxxx_xxxx_r_000009_1 on tracker_m01.aaa.com: Error: java.lang.OutOfMemoryError: Java heap space
        2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009xxxx_xxxx_m_000478_0 on tracker_m02.aaa.com: Task attempt_2009xxxx_xxxx_m_000478_0 failed to report status for 601 seconds. Killing!
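
With the tracker name in the message, finding every failed attempt on one node becomes a simple text match. The sketch below shows one way to do this in Java, assuming the message format matches the two sample lines above and that the tracker name contains no whitespace or colon-space sequence. The class name `TrackerErrorFilter` and its method are hypothetical and are not part of the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hadoop: filters JobTracker error lines by tracker name. */
final class TrackerErrorFilter {

    // Matches the message part of the new log line:
    // "Error from <attempt id> on <tracker name>: <diagnostic>"
    private static final Pattern ERROR_LINE =
        Pattern.compile("Error from (attempt_\\S+) on (\\S+?): (.*)$");

    /** Returns the ids of the attempts whose errors were logged for the given tracker. */
    static List<String> attemptsOnTracker(List<String> logLines, String tracker) {
        List<String> attempts = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = ERROR_LINE.matcher(line);
            if (m.find() && m.group(2).equals(tracker)) {
                attempts.add(m.group(1));
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: "
                + "Error from attempt_2009xxxx_xxxx_r_000009_1 on tracker_m01.aaa.com: "
                + "Error: java.lang.OutOfMemoryError: Java heap space",
            "2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: "
                + "Error from attempt_2009xxxx_xxxx_m_000478_0 on tracker_m02.aaa.com: "
                + "Task attempt_2009xxxx_xxxx_m_000478_0 failed to report status for 601 seconds. Killing!");
        System.out.println(attemptsOnTracker(lines, "tracker_m01.aaa.com"));
    }
}
```

For a quick look at a single log file, grepping for the hostname as the description suggests is enough; a parser like this is only worthwhile when the results feed into further processing, such as counting failures per node.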

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12426933/MAPREDUCE-1265-v2.patch
        against trunk revision 887135.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/console

        This message is automatically generated.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12426933/MAPREDUCE-1265-v2.patch
        against trunk revision 887844.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/298/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/298/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/298/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/298/console

        This message is automatically generated.

        Scott Chen added a comment -

        This patch changes only the contents of a log message, so no unit test is included.

        dhruba borthakur added a comment -

        I do not think this needs a unit test because it changes just a single LOG line (to print out the name of the node). I will commit it in a day.

        dhruba borthakur added a comment -

        I just committed this. Thanks Scott!

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #196 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/196/)


          People

          • Assignee: Scott Chen
          • Reporter: Scott Chen
          • Votes: 0
          • Watchers: 3
