Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version: 0.4.0
Description
The JobTracker tries to avoid rerunning a task on a node where it has previously failed, but it does not strictly prevent it.
I propose changing the rule so that when a TaskTracker calls pollForNewTask, the JobTracker assigns it a task that has already failed on that TaskTracker only if the task has failed on every TaskTracker in the cluster. Thus, for "normal" clusters with more than 4 TaskTrackers, a repeatedly failing task is guaranteed to be tried on 4 different TaskTrackers. For small clusters, it will run on every TaskTracker in the cluster at least once.
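To make the proposed rule concrete, here is a minimal sketch of the check the JobTracker could apply before handing a task to a polling TaskTracker. The class name FailureAwareAssigner and the methods recordFailure and mayAssign are hypothetical and chosen for illustration only; this is not the existing JobTracker code, and it assumes TaskTrackers and tasks can be identified by plain strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the proposed rule; not the actual JobTracker code. */
class FailureAwareAssigner {
    // Task id -> names of the TaskTrackers on which that task has failed.
    private final Map<String, Set<String>> failedOn = new HashMap<>();
    // Names of all TaskTrackers currently in the cluster (assumed known).
    private final Set<String> clusterTrackers = new HashSet<>();

    FailureAwareAssigner(Set<String> trackers) {
        clusterTrackers.addAll(trackers);
    }

    /** Remember that the given task failed on the given TaskTracker. */
    void recordFailure(String taskId, String tracker) {
        failedOn.computeIfAbsent(taskId, k -> new HashSet<>()).add(tracker);
    }

    /**
     * A task may go to a TaskTracker if it has never failed there, or if it
     * has already failed on every TaskTracker in the cluster.
     */
    boolean mayAssign(String taskId, String tracker) {
        Set<String> failed =
            failedOn.getOrDefault(taskId, Collections.<String>emptySet());
        if (!failed.contains(tracker)) {
            return true;
        }
        return failed.containsAll(clusterTrackers);
    }
}
```

With two TaskTrackers, a task that failed on the first would be refused there until it has also failed on the second, which matches the small-cluster behavior described above.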
Does that sound reasonable to everyone?