Description
TestMockDAGAppMaster.testInternalPreemption intermittently fails with expected:<KILLED> but was:<FAILED>
The crux of the matter is that TaskSchedulerManager sends two events:
- TaskScheduler#deallocatedContainer -> TaskSchedulerManager#containerBeingReleased -> sends AMContainerStopRequest -> TA_CONTAINER_TERMINATING
- AMContainerEventCompleted -> TA_CONTAINER_TERMINATED_BY_SYSTEM
For the task attempt to be killed correctly, the second message loop must complete first. Because the first path is longer, the second message loop almost always completes first. When the first message loop does complete first, the task attempt is marked as FAILED rather than KILLED, and the test fails.
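The ordering dependence can be illustrated with a toy model. This is only a sketch and not Tez code: the class PreemptionRaceSketch, its enums, and the "first event to arrive decides the terminal state" rule are assumptions made for illustration. Only the two event names and the KILLED/FAILED outcomes come from the description above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the race; not the actual Tez state machine.
final class PreemptionRaceSketch {
    // Event names taken from the description above.
    enum Event { TA_CONTAINER_TERMINATING, TA_CONTAINER_TERMINATED_BY_SYSTEM }

    // Simplified attempt states (assumption: only these three matter here).
    enum State { RUNNING, KILLED, FAILED }

    // Assumption: the first event to reach a RUNNING attempt decides its
    // terminal state, and later events are ignored.
    static State apply(State current, Event event) {
        if (current != State.RUNNING) {
            return current;
        }
        switch (event) {
            case TA_CONTAINER_TERMINATED_BY_SYSTEM:
                return State.KILLED;
            case TA_CONTAINER_TERMINATING:
                return State.FAILED;
            default:
                return current;
        }
    }

    // Replays the events in the given arrival order and returns the final state.
    static State finalState(List<Event> arrivalOrder) {
        State state = State.RUNNING;
        for (Event event : arrivalOrder) {
            state = apply(state, event);
        }
        return state;
    }

    public static void main(String[] args) {
        // Usual order: the shorter second path delivers its event first.
        System.out.println(finalState(Arrays.asList(
                Event.TA_CONTAINER_TERMINATED_BY_SYSTEM,
                Event.TA_CONTAINER_TERMINATING)));
        // Rare order: the longer first path wins the race.
        System.out.println(finalState(Arrays.asList(
                Event.TA_CONTAINER_TERMINATING,
                Event.TA_CONTAINER_TERMINATED_BY_SYSTEM)));
    }
}
```

In this model the same two events yield KILLED in the usual arrival order and FAILED in the rare one, which matches the intermittent assertion failure reported above.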
Attachments
Issue Links
- is related to
  - TEZ-4036 TestMockDAGAppMaster#testInternalPreemption should assert for failed state (Closed)