Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Today, if a container fails for any reason, the app simply ignores the failure. We should handle retries, improve error reporting, etc.
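As a rough illustration of the retry handling proposed here, the following is a minimal sketch and not the distributed shell's actual code. It assumes the application master can map each completed container back to a logical task and read its exit status (presumably from the `ContainerStatus` objects it receives on completion). The class name `ContainerRetryTracker`, the string task ids, and the `maxRetries` limit are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical retry bookkeeping for failed containers; not part of YARN. */
public class ContainerRetryTracker {
    // Maximum number of replacement containers to request per task (assumed policy).
    private final int maxRetries;
    // Failure count per logical task id (a hypothetical id, not a YARN ContainerId).
    private final Map<String, Integer> failures = new HashMap<>();

    public ContainerRetryTracker(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /**
     * Records a completed container for the given task.
     * Returns true if a replacement container should be requested.
     */
    public boolean shouldRetry(String taskId, int exitStatus) {
        if (exitStatus == 0) {
            return false; // success, nothing to retry
        }
        // Count this failure, then allow a retry while within the limit.
        int count = failures.merge(taskId, 1, Integer::sum);
        return count <= maxRetries;
    }

    /** Number of failures seen so far for the task, useful for error reporting. */
    public int failureCount(String taskId) {
        return failures.getOrDefault(taskId, 0);
    }
}
```

With a limit of 2, a task would be retried after its first and second failures and reported as failed after the third, with `failureCount` available for the error report.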
Issue Links
- relates to: YARN-816 Implement AM recovery for distributed shell (Open)