Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-339

JobTracker should give preference to failed tasks over virgin tasks so as to terminate the job ASAP if it is eventually going to fail.

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Case in point... I have 1585 maps and 160 slots (40 nodes). The job is such that all maps fail within 2-3 minutes. The job takes forever to realise that the job is bad. It took 2526 failures for it to reach 4 failed attempts for a task.

      As I understand, currently the JT prefers a failed task if and only if a task tracker with a split replica for that map came asking for a task. In fact there may not be a single TT at all in the mapred cluster which has a replica for the splits used in this job (pre-0.20). This delays the job failure by a lot and hence degrades cluster utilization as a whole. If i'm on a shared cluster with many jobs waiting on it to fail, it's bad.

      The JT should prefer a failed task a lot earlier than waiting for a data local TT to come around asking.

      1. MAPREDUCE-339-v1.2.patch
        10 kB
        Amar Kamat
      2. M339-0y20s.patch
        16 kB
        Chris Douglas

        Activity

        Hide
        Gautam Kowshik added a comment -

        Would it make sense to provide a feature to be able to force a kill from within the job? Once the user's mapreduce job detects that it has reached a state after which it can't resume, it can hint/force the JT to end this job, an emergency button of sorts. This would empower the user implementations to get out of the bad jobs asap and achieve better cluster utilization.

        Show
        Gautam Kowshik added a comment - Would it make sense to provide a feature to be able to force a kill from within the job? Once the user's mapreduce job detects that it has reached a state after which it can't resume, it can hint/force the JT to end this job, an emergency button of sorts. This would empower the user implementations to get out of the bad jobs asap and achieve better cluster utilization.
        Hide
        Gautam Kowshik added a comment -

        To add, the job I was running took 1hrs, 22mins, 13sec to fail even though the each map fails immediately, within 2-3 minutes

        Show
        Gautam Kowshik added a comment - To add, the job I was running took 1hrs, 22mins, 13sec to fail even though the each map fails immediately, within 2-3 minutes
        Hide
        Amar Kamat added a comment -

        I would propose

        1. upon a task failure, increment the failure count of the task
        2. after few failures (threshold), the task gets added to the global failure list, which gets the highest priority across all tasks. So this way we respect the locality but then later slowly respect the failed tasks.

        Thoughts?

        Show
        Amar Kamat added a comment - I would propose upon a task failure, increment the failure count of the task after few failures (threshold), the task gets added to the global failure list, which gets the highest priority across all tasks. So this way we respect the locality but then later slowly respect the failed tasks. Thoughts?
        Hide
        Vinod Kumar Vavilapalli added a comment -

        Shouldn't we also sort the local tasks also in the reverse order of failure counts?

        Show
        Vinod Kumar Vavilapalli added a comment - Shouldn't we also sort the local tasks also in the reverse order of failure counts?
        Hide
        Amar Kamat added a comment -

        Vinod, we already do that. When a node/rack-local task fails, we put it in the front of the queue and hence it gets picked up first.

        Show
        Amar Kamat added a comment - Vinod, we already do that. When a node/rack-local task fails, we put it in the front of the queue and hence it gets picked up first.
        Hide
        Amar Kamat added a comment -

        Also we might want to be greedy and schedule failed task over virgin tasks as soon as we find out that the almost all the tasks for a job is failing. Something like

        if (num-attempts-scheduled > x% of total-tasks  && num-attempts-failed/num-attempts-scheduled > y%) then
        schedule failed tasks greedily
        

        Thoughts?

        Show
        Amar Kamat added a comment - Also we might want to be greedy and schedule failed task over virgin tasks as soon as we find out that the almost all the tasks for a job is failing. Something like if (num-attempts-scheduled > x% of total-tasks && num-attempts-failed/num-attempts-scheduled > y%) then schedule failed tasks greedily Thoughts?
        Hide
        Amar Kamat added a comment -

        Attaching a patch that simply queues up failed task and first checks for failed task before scheduling any task. Result of test-patch
        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        Testing in progress.

        Show
        Amar Kamat added a comment - Attaching a patch that simply queues up failed task and first checks for failed task before scheduling any task. Result of test-patch [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Testing in progress.

          People

          • Assignee:
            Devaraj Das
            Reporter:
            Gautam Kowshik
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:

              Development