Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: jobtracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      This test case is failing in trunk after the commit of MAPREDUCE-2207

      1. 2271-1.diff
        1.0 kB
        Liyin Liang

        Issue Links

          Activity

          Liyin Liang added a comment -

          With MAPREDUCE-2207, a tracker can't get any task-cleanup task if it has tasks in the FAILED_UNCLEAN state. testNumSlotsUsedForTaskCleanup in TestSetupTaskScheduling creates a dummy tracker status with two FAILED_UNCLEAN tasks to report, so the JobTracker returns null when getSetupAndCleanupTasks is called with that tracker status.
          I think it's unnecessary to add the task statuses to the tracker status in this test case, because the job already has two task-setup-tasks to schedule and the job's two tasks are already in the FAILED_UNCLEAN state. In other words, the job's task statuses don't need to be updated.
          So we can simply remove the addNewTaskStatus calls, as in 2271-1.diff.
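          The interaction can be sketched with a toy model (plain Java; the class, enum, and method names below are illustrative, not the actual JobTracker API). A heartbeat that reports any FAILED_UNCLEAN statuses gets no cleanup tasks, while an empty heartbeat does:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the post-MAPREDUCE-2207 behavior: a tracker whose
// heartbeat reports FAILED_UNCLEAN task statuses receives no
// task-cleanup tasks in that same heartbeat.
public class CleanupScheduling {
    enum State { RUNNING, FAILED_UNCLEAN }

    // Returns null when the reporting tracker has FAILED_UNCLEAN tasks,
    // mirroring what the test observed from getSetupAndCleanupTasks().
    static List<String> getCleanupTasks(List<State> reported,
                                        List<String> pendingCleanups) {
        for (State s : reported) {
            if (s == State.FAILED_UNCLEAN) {
                return null; // defer cleanup to a later heartbeat
            }
        }
        return new ArrayList<>(pendingCleanups);
    }

    public static void main(String[] args) {
        List<String> pending = List.of("cleanup-map", "cleanup-reduce");

        // Heartbeat carrying FAILED_UNCLEAN statuses: nothing scheduled.
        List<State> dirty = List.of(State.FAILED_UNCLEAN, State.FAILED_UNCLEAN);
        System.out.println(getCleanupTasks(dirty, pending)); // null

        // Empty heartbeat (what the fixed test sends): tasks are scheduled.
        System.out.println(getCleanupTasks(List.of(), pending));
    }
}
```

          Removing the addNewTaskStatus calls puts the test in the second case: the dummy heartbeat carries no statuses, so the cleanup tasks it wants to verify actually get scheduled.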

          Todd Lipcon added a comment -

          Sorry if I'm being dense, but I don't quite follow the logic of this test. Could you help me understand what's supposed to be going on here?

          It seems createJob() already makes the two cleanup tasks with status as FAILED_UNCLEAN from that tracker. Is the change in MAPREDUCE-2207 only such that the cleanup-tasks won't be rescheduled in response to the same heartbeat that marked them failed? So in this test we get them scheduled because it's a new heartbeat with no task statuses, even though it previously had failed on that tracker?

          Liyin Liang added a comment -

          Hi Todd,
          I think the test case testNumSlotsUsedForTaskCleanup is supposed to check that one task-cleanup task needs only one slot, even for high-RAM jobs. The test creates a fake high-RAM job with one map task and one reduce task, each requiring 2 slots, and then checks that each heartbeat schedules one task-cleanup task needing only one slot. So it doesn't need to create a dummy tracker status with FAILED_UNCLEAN tasks.
          The result of the change in MAPREDUCE-2207 is that task-cleanup tasks can't be scheduled to trackers that report FAILED_UNCLEAN tasks during a heartbeat, no matter which tracker the task failed on. As a result, no task-cleanup task is scheduled during heartbeats in this test case. The following code:

          List<Task> tasks = jobTracker.getSetupAndCleanupTasks(ttStatus);
          

          will always return null as long as ttStatus has tasks with FAILED_UNCLEAN status.
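          The slot-accounting property the test is really after can be sketched as a toy rule (illustrative names, not the real scheduler code): regular tasks of a high-RAM job occupy the job's per-task slot count, but a task-cleanup attempt always takes one slot.

```java
// Toy model of the slot accounting testNumSlotsUsedForTaskCleanup checks:
// a task-cleanup attempt needs 1 slot regardless of the job's per-task
// slot requirement.
public class SlotAccounting {
    static int slotsNeeded(int jobSlotsPerTask, boolean isTaskCleanup) {
        return isTaskCleanup ? 1 : jobSlotsPerTask;
    }

    public static void main(String[] args) {
        int highRam = 2; // the fake high-RAM job: 2 slots per task

        System.out.println(slotsNeeded(highRam, false)); // regular task: 2
        System.out.println(slotsNeeded(highRam, true));  // cleanup task: 1
    }
}
```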

          Todd Lipcon added a comment -

          Committed this to trunk, thanks Liyin!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #585 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/585/)
          MAPREDUCE-2271. Fix TestSetupTaskScheduling failure on trunk. Contributed by Liyin Liang.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)


            People

            • Assignee:
              Liyin Liang
            • Reporter:
              Todd Lipcon
            • Votes:
              0
            • Watchers:
              1
