When you say "knock off the ping thread", I assume you really mean just the ping timeout check, since the task progress happens in the same thread?
So the ping serves multiple purposes. First, it notifies the AM that the task has "pinged" in and is still running. This could be useful even with taskTimeout, since taskTimeout can be turned off (set to 0), in which case we would never know if the task got hung. Second, the task uses it to check whether the AM is still alive: if the ping doesn't return true, the task is supposed to exit. 1.X also had the ping check, but it went to the TaskTracker, and the TaskTracker validated that the parent Task of the ping checker thread was still there.
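To make the first purpose concrete, here is a rough sketch of an AM-side liveness check. It is not the actual handler code: the class name, method names, and the separate `pingTimeoutMs` setting are made up for illustration, and I'm assuming a value of 0 disables the corresponding check.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of an AM-side liveness check; not Hadoop's real handler. */
class HeartbeatSketch {
    private final long taskTimeoutMs; // assumption: 0 disables the progress timeout
    private final long pingTimeoutMs; // assumption: 0 models removing the ping timeout check
    private final Map<String, Long> lastProgress = new HashMap<>();
    private final Map<String, Long> lastPing = new HashMap<>();

    HeartbeatSketch(long taskTimeoutMs, long pingTimeoutMs) {
        this.taskTimeoutMs = taskTimeoutMs;
        this.pingTimeoutMs = pingTimeoutMs;
    }

    /** Start tracking an attempt; both clocks start at registration time. */
    void register(String attemptId, long nowMs) {
        lastProgress.put(attemptId, nowMs);
        lastPing.put(attemptId, nowMs);
    }

    /** A progress report also counts as a sign of life for the ping check. */
    void progressing(String attemptId, long nowMs) {
        lastProgress.put(attemptId, nowMs);
        lastPing.put(attemptId, nowMs);
    }

    /** A bare ping only refreshes the ping clock. */
    void pinged(String attemptId, long nowMs) {
        lastPing.put(attemptId, nowMs);
    }

    /** True if either enabled timeout has expired for this attempt. */
    boolean isTimedOut(String attemptId, long nowMs) {
        Long progress = lastProgress.get(attemptId);
        Long ping = lastPing.get(attemptId);
        if (progress == null || ping == null) {
            return false; // unknown attempt: nothing to time out
        }
        boolean progressExpired = taskTimeoutMs > 0 && nowMs - progress > taskTimeoutMs;
        boolean pingExpired = pingTimeoutMs > 0 && nowMs - ping > pingTimeoutMs;
        return progressExpired || pingExpired;
    }
}
```

With taskTimeout at 0 and the ping check enabled, a silent attempt is still flagged once the ping timeout passes. With both at 0, nothing on the AM side ever flags it.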
Now with 0.23, the NodeManager watches the processes and talks back to the RM to let it know when the AM has died, and if the AM died, the other tasks are killed. But if the entire NodeManager goes down, the task doesn't know the AM went away. If the task isn't sending progress, the task timeout is set to 0, and this is the last AM retry, the task could hang around forever.
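The second purpose, the task noticing on its own that the AM is gone, could look roughly like the loop below. This is only a sketch: `pingAm` stands in for whatever call the task makes to the AM, the method name is hypothetical, and treating a failed call the same as a `false` reply is my assumption (a real loop would probably retry a few times and sleep between pings).

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of the task-side ping loop; names are illustrative only. */
class PingLoopSketch {
    /**
     * Pings until the AM answers false, the call fails, or maxPings is reached.
     * Returns the number of successful pings; the caller would exit the task
     * if the loop stopped before maxPings.
     */
    static int pingUntilAmGone(BooleanSupplier pingAm, int maxPings) {
        int successful = 0;
        while (successful < maxPings) {
            boolean alive;
            try {
                alive = pingAm.getAsBoolean();
            } catch (RuntimeException e) {
                alive = false; // assumption: an unreachable AM is treated as gone
            }
            if (!alive) {
                break; // AM did not confirm it is alive: stop pinging
            }
            successful++;
        }
        return successful;
    }
}
```

Without a check like this on the task side, nothing tells the task to exit in the case above where the NodeManager is down and the AM is gone.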
The odds of that seem pretty small, and I guess if we aren't worried about the first case happening, the second probably isn't that interesting either. But we could also just remove the ping timeout check in the TaskHeartbeatHandler. What exactly are you proposing?