Hadoop Common / HADOOP-5269

TaskTracker.runningTasks holding FAILED_UNCLEAN and KILLED_UNCLEAN taskStatuses forever in some cases.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.19.1
    • Fix Version/s: 0.19.2, 0.20.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The TaskTracker holds TaskStatus objects in runningTasks forever in some cases. This happens in the following scenario:
      -> The task gets an exception.
      -> It sets its phase to CLEANUP.
      -> The task tries to do cleanup and stops responding after that.
      -> The TaskTracker marks the task unresponsive and sets its state to FAILED_UNCLEAN.
      -> The TaskTracker doesn't remove it from the runningTasks data structure, since the phase is CLEANUP and the state is FAILED_UNCLEAN (it treats this as a cleanupAttempt).

      I propose that once a task enters the CLEANUP phase, a kill on the task should mark it as a clean failure, i.e. the task state should be FAILED/KILLED.
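
The scenario and the proposed rule can be sketched as below. This is a minimal, self-contained model, not the actual TaskTracker code: the class CleanupKillSketch, the Status holder, the reduced Phase/State enums and the helpers looksLikeCleanupAttempt, purgeIfDone and stateOnKill are all hypothetical names introduced for illustration. It also assumes that a failure outside the CLEANUP phase still goes to the *_UNCLEAN states.

```java
import java.util.HashMap;
import java.util.Map;

public class CleanupKillSketch {
    // Reduced, hypothetical subsets of the phases and states named in this issue.
    enum Phase { MAP, CLEANUP }
    enum State { RUNNING, FAILED, KILLED, FAILED_UNCLEAN, KILLED_UNCLEAN }

    // Hypothetical stand-in for a task status entry.
    static final class Status {
        final Phase phase;
        final State state;
        Status(Phase phase, State state) { this.phase = phase; this.state = state; }
    }

    // Models the check described above: CLEANUP phase plus an *_UNCLEAN state
    // is treated as a cleanup attempt that is still pending.
    static boolean looksLikeCleanupAttempt(Status s) {
        return s.phase == Phase.CLEANUP
            && (s.state == State.FAILED_UNCLEAN || s.state == State.KILLED_UNCLEAN);
    }

    // Drops a finished task from the map unless it looks like a cleanup attempt.
    static void purgeIfDone(Map<String, Status> runningTasks, String id) {
        Status s = runningTasks.get(id);
        if (s != null && s.state != State.RUNNING && !looksLikeCleanupAttempt(s)) {
            runningTasks.remove(id);
        }
    }

    // Proposed rule: a kill or failure while already in CLEANUP is a clean one.
    // Assumption: outside CLEANUP the task still becomes *_UNCLEAN.
    static State stateOnKill(Phase phase, boolean wasFailure) {
        if (phase == Phase.CLEANUP) {
            return wasFailure ? State.FAILED : State.KILLED;
        }
        return wasFailure ? State.FAILED_UNCLEAN : State.KILLED_UNCLEAN;
    }

    public static void main(String[] args) {
        Map<String, Status> runningTasks = new HashMap<>();

        // Reported behaviour: the unresponsive task is marked FAILED_UNCLEAN in CLEANUP.
        runningTasks.put("attempt_a", new Status(Phase.CLEANUP, State.FAILED_UNCLEAN));
        purgeIfDone(runningTasks, "attempt_a");
        System.out.println("reported behaviour, still held: " + runningTasks.containsKey("attempt_a"));

        // Proposed rule: the same kill yields FAILED, so the entry can be dropped.
        runningTasks.put("attempt_b", new Status(Phase.CLEANUP, stateOnKill(Phase.CLEANUP, true)));
        purgeIfDone(runningTasks, "attempt_b");
        System.out.println("proposed rule, still held: " + runningTasks.containsKey("attempt_b"));
    }
}
```

In this model the first entry stays in the map because it is indistinguishable from a pending cleanup attempt, while the second is removed because its state is already clean.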

      1. patch-5269.txt
        11 kB
        Amareshwari Sriramadasu
      2. patch-5269-0.19-0.20.txt
        10 kB
        Amareshwari Sriramadasu

        Activity

        Amareshwari Sriramadasu added a comment -

        Attaching patch with the fix.

        Amareshwari Sriramadasu added a comment -

        test-patch result :

         
             [exec]
             [exec]
             [exec] +1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec]
        
        Amareshwari Sriramadasu added a comment -

        The ant tests passed on my machine.
        Ran the Reliability test and the Sort benchmark.
        Also verified the OutOfMemory run on which Vinod saw this issue.

        Amareshwari Sriramadasu added a comment -

        Patch for 0.19 and 0.20

        Amareshwari Sriramadasu added a comment -

        Patch for 0.19 and 0.20. Earlier patch had an unnecessary comment... removed that.

        Devaraj Das added a comment -

        I just committed this to the 0.20 branch and trunk. Thanks, Amareshwari! (Once 0.19.1, for which voting is under way, is released, we should commit this to the 0.19 branch as well.)

        Hudson added a comment -

        Integrated in Hadoop-trunk #763 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/763/ )

        Devaraj Das added a comment -

        I committed this to the 0.19 branch.


          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Amareshwari Sriramadasu
          • Votes:
            0
            Watchers:
            3
