Description
When a job fails, the AM hangs during shutdown. A non-daemon thread in a thread pool executor prevents JVM teardown from completing, so the AM lingers on the cluster in the FINISHING state for the AM expiry interval until the RM eventually expires it and kills the container. If application limits on the queue are relatively low (e.g., a small queue or a small cluster), this can cause unnecessary delays in resource scheduling on the cluster.
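For illustration only, here is a minimal Java sketch of the general mechanism, not the actual AM code or the fix applied for this issue. A JVM does not exit while any non-daemon thread is alive, and the default thread factory used by Executors creates non-daemon threads. The class DaemonPoolSketch and the helpers daemonFactory and stopPool are hypothetical names introduced here; they show two common ways to keep a pool from blocking exit: creating its threads as daemons, and shutting the pool down explicitly.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from the Hadoop code base.
public class DaemonPoolSketch {

    // Builds a factory whose threads are daemons, so an idle pool
    // cannot keep the JVM alive after main() returns.
    public static ThreadFactory daemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
    }

    // Stops the pool explicitly. Returns true if all submitted tasks
    // finished within the timeout, false if the pool had to be interrupted.
    public static boolean stopPool(ExecutorService pool, long timeoutMillis) {
        pool.shutdown(); // stop accepting new tasks
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        pool.shutdownNow(); // interrupt any tasks still running
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool =
            Executors.newFixedThreadPool(2, daemonFactory("cleanup"));
        pool.submit(() ->
            System.out.println("daemon thread: " + Thread.currentThread().isDaemon()));
        stopPool(pool, 1000);
    }
}
```

With the default factory and no explicit shutdown, the same program would stay alive after main() returns, which is the kind of lingering process described above.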
Attachments
Issue Links
- is broken by
  - MAPREDUCE-5317 Stale files left behind for failed jobs (Closed)
- is duplicated by
  - YARN-2283 RM failed to release the AM container (Resolved)