I agree that the original desire for this patch was born of the TaskTracker timeouts that shouldn't happen. Fixing those problems (and we have fixed most of them over the last 4 months) should take precedence. That said, I think in the long term we do want something like this patch. If a switch goes down for 15 minutes and then comes back up, it does not make sense to reshuffle, re-sort, and rerun a reduce that takes hours to complete.
All map/reduce applications, even those with speculative execution turned off, must tolerate redundant copies of their tasks for precisely this reason. In this case, the JobTracker has decided a given task is dead but hasn't been able to tell the responsible TaskTracker yet, so it schedules another instance of the task on a different node. The two instances then run in parallel for a while.
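To make the requirement concrete, here is a minimal sketch (plain Java, not Hadoop's actual API; TaskAttempt, run, commit, and abort are all hypothetical names) of the standard way redundant attempts are made safe: each attempt writes only to a private scratch directory, and the framework promotes the output of exactly one attempt via an atomic rename, discarding the rest.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Comparator;

/**
 * Illustrative only: two attempts of the same task can run in parallel
 * without interfering, because neither touches the final output location
 * until the framework commits exactly one of them.
 */
public class TaskAttempt {
    private final Path tempDir;   // per-attempt scratch directory
    private final Path finalDir;  // shared final location for the task

    public TaskAttempt(String taskId, int attemptId, Path jobOutput) {
        // Each attempt gets its own scratch dir, so parallel attempts
        // of the same task never collide.
        this.tempDir = jobOutput.resolve("_temporary")
                                .resolve(taskId + "_" + attemptId);
        this.finalDir = jobOutput.resolve(taskId);
    }

    public void run() throws IOException {
        Files.createDirectories(tempDir);
        // ... the task writes all of its output under tempDir ...
        Files.write(tempDir.resolve("part-00000"), "output".getBytes());
    }

    /** Called by the framework for exactly one attempt per task. */
    public void commit() throws IOException {
        // Atomic rename: either the whole output appears or none of it.
        Files.move(tempDir, finalDir, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Called for the losing attempt(s); their output is discarded. */
    public void abort() throws IOException {
        try (var paths = Files.walk(tempDir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try { Files.delete(p); } catch (IOException ignored) {}
            });
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("job-output");
        TaskAttempt a0 = new TaskAttempt("task_0001", 0, out);
        TaskAttempt a1 = new TaskAttempt("task_0001", 1, out); // redundant copy
        a0.run();
        a1.run();     // in practice these run concurrently; safe either way
        a1.commit();  // framework picks one winner
        a0.abort();   // loser's scratch output is thrown away
        System.out.println("Committed: " + out.resolve("task_0001"));
    }
}
```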
I guess for now, let's sit on this patch and think through what the model for handling communication problems should be. We should also monitor this in real use to see how often TaskTrackers are being lost, and put some effort into determining whether it is the JobTracker or the TaskTracker that is the cause of the delay.