[MAPREDUCE-1682] Tasks should not be scheduled after tip is killed/failed. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: jobtracker
Labels: None

Description

We have seen the following scenario in our cluster:
A job was marked failed because four attempts of a TIP failed. This kills all the map and reduce TIPs, after which a job-cleanup attempt is launched.
The job-cleanup attempt failed because it could not report status for 10 minutes. There were 3 such job-cleanup attempts, so the job was only killed after half an hour.
While waiting for the job cleanup to finish, the JobTracker kept scheduling many tasks of the job on TaskTrackers and then sent a KillTaskAction for them in the next heartbeat.

This wastes a lot of resources. We should avoid scheduling tasks of a TIP once the TIP is killed/failed.
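
A minimal sketch of the kind of guard being asked for, in Java. This is only an illustration under stated assumptions: TipSketch, JobSketch, fail() and obtainNewTask() are hypothetical stand-ins invented here, not the actual JobTracker classes and not the attached patch. The idea is simply that the scheduling path checks the TIP state before handing out a new attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a task-in-progress (TIP); not the real Hadoop class. */
class TipSketch {
    enum State { RUNNABLE, KILLED, FAILED, COMPLETE }

    final String id;
    State state = State.RUNNABLE;

    TipSketch(String id) { this.id = id; }

    /** A TIP may receive a new attempt only while it is still runnable. */
    boolean isSchedulable() { return state == State.RUNNABLE; }
}

/** Hypothetical stand-in for the per-job bookkeeping; not the real Hadoop class. */
class JobSketch {
    final List<TipSketch> tips = new ArrayList<>();

    /** Marking the job failed kills every TIP that is still runnable. */
    void fail() {
        for (TipSketch tip : tips) {
            if (tip.state == TipSketch.State.RUNNABLE) {
                tip.state = TipSketch.State.KILLED;
            }
        }
    }

    /** Returns the first TIP that may still be scheduled, or empty if none. */
    Optional<TipSketch> obtainNewTask() {
        for (TipSketch tip : tips) {
            if (tip.isSchedulable()) { // the guard: skip killed/failed TIPs
                return Optional.of(tip);
            }
        }
        return Optional.empty();
    }
}

class TipSchedulingDemo {
    public static void main(String[] args) {
        JobSketch job = new JobSketch();
        job.tips.add(new TipSketch("tip_m_000000"));
        job.tips.add(new TipSketch("tip_m_000001"));

        System.out.println("before failure: " + job.obtainNewTask().isPresent());
        job.fail(); // job-cleanup may still be pending at this point
        System.out.println("after failure:  " + job.obtainNewTask().isPresent());
    }
}
```

With such a check in place, heartbeats that arrive while the job-cleanup attempts are still pending would get no new task for the job, instead of a launch followed by a KillTaskAction.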

Attachments

mapreduce-1682-ydh.txt (18/Aug/10 23:45, 0.8 kB, Todd Lipcon)

People

Assignee: Arun Murthy

Reporter: Amareshwari Sriramadasu

Votes: 0

Watchers: 5

Dates

Created: 07/Apr/10 04:37

Updated: 30/Jul/14 17:31

Resolved: 30/Jul/14 17:31