The cause of this problem is that updates to the task/taskAttempt states are not synchronized with updates to the job's TaskAttemptCompletionEvents, because the dispatcher delivers events asynchronously. If the async dispatcher is delayed long enough, the task/taskAttempt states will already have changed while the TaskAttemptCompletionEvents are still stale. To work around this, the test could wait up to a maximum amount of time for the TaskAttemptCompletionEvents to be updated. The test would then fail with a false alarm only in extreme cases, when the update is delayed longer than our wait time. Please see my preliminary patch.
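To illustrate the bounded wait described above, here is a minimal sketch, not the code from the patch. The class `BoundedWait`, the method `waitFor`, and the `completionEvents` counter are hypothetical names used only for illustration; the counter stands in for the job's TaskAttemptCompletionEvents, and the background thread stands in for a delayed async dispatcher.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop: polls a condition with an upper time bound.
final class BoundedWait {

    /**
     * Polls the condition every intervalMs until it holds or timeoutMs has elapsed.
     * Returns true if the condition was observed to hold, false on timeout or interrupt.
     */
    static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // the update was delayed longer than our wait time
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: this counter stands in for the number of completion events on the job.
        AtomicInteger completionEvents = new AtomicInteger(0);

        // Simulates a dispatcher that delivers the events late.
        Thread dispatcher = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            completionEvents.set(2);
        });
        dispatcher.start();

        // Wait instead of asserting immediately on a possibly stale value.
        boolean updated = waitFor(() -> completionEvents.get() == 2, 50, 5000);
        dispatcher.join();
        System.out.println("events updated in time: " + updated);
    }
}
```

In the test, the assertion on the completion events would run only after `waitFor` returns true. A false return would be reported as a timeout, which makes the rare false alarm easy to tell apart from a genuine state mismatch.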