Details
Description
The AM has no flow control to limit the rate of incoming events from tasks. If the AM cannot keep pace with incoming events for a sufficient period of time, it will eventually exhaust the heap and crash. MAPREDUCE-5043 addressed a major bottleneck in event processing, but the AM can still fall behind if it is starved for CPU and/or is handling a very large job with tens of thousands of active tasks.
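To illustrate what flow control could mean here, the following is a minimal sketch of a bounded event queue that pushes back on producers when the consumer falls behind. It is not the AM's actual dispatcher and not the fix applied for this issue; the class `BoundedEventQueue`, its methods, and the string event payloads are all hypothetical names chosen for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical bounded event queue; NOT the real MapReduce AM dispatcher. */
public class BoundedEventQueue {
    // A fixed capacity caps how much heap pending events can consume.
    private final BlockingQueue<String> queue;

    public BoundedEventQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Producer side (a task reporting an event, in this sketch).
     * Waits up to timeoutMillis for a free slot and returns false if
     * none appears, so the caller can slow down or retry later.
     */
    public boolean offer(String event, long timeoutMillis) {
        try {
            return queue.offer(event, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }

    /** Consumer side: returns the oldest pending event, or null if empty. */
    public String poll() {
        return queue.poll();
    }

    /** Number of events currently waiting to be processed. */
    public int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        BoundedEventQueue q = new BoundedEventQueue(2);
        System.out.println(q.offer("status-update-1", 10)); // accepted
        System.out.println(q.offer("status-update-2", 10)); // accepted
        System.out.println(q.offer("status-update-3", 10)); // rejected: queue full
        q.poll();                                           // consumer catches up
        System.out.println(q.offer("status-update-3", 10)); // accepted on retry
    }
}
```

With an unbounded queue the third `offer` would always succeed and memory use would grow with the backlog. With a bound, the producer sees the rejection and can back off, which is the behavior the description says is missing.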
Attachments
Issue Links
- breaks
  - MAPREDUCE-7028 Concurrent task progress updates causing NPE in Application Master (Resolved)
  - MAPREDUCE-7020 Task timeout in uber mode can crash AM (Resolved)
  - MAPREDUCE-7053 Timed out tasks can fail to produce thread dump (Resolved)
- is related to
  - MAPREDUCE-6242 Progress report log is incredibly excessive in application master (Resolved)
- relates to
  - YARN-270 RM scheduler event handler thread gets behind (Open)
  - MAPREDUCE-5043 Fetch failure processing can cause AM event queue to backup and eventually OOM (Closed)
  - YARN-3630 YARN should suggest a heartbeat interval for applications (Open)