Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
-
Reviewed
Description
try { response = client.allocate(progress); } catch (ApplicationAttemptNotFoundException e) { handler.onShutdownRequest(); LOG.info("Shutdown requested. Stopping callback."); return;
is a code snippet from AMRMClientAsyncImpl. The corresponding onShutdownRequest call for the Distributed Shell App master,
@Override public void onShutdownRequest() { done = true; }
Due to the above change, the current behavior is that whenever an application attempt fails due to a NM restart (NM where the DS AM is running), an ApplicationAttemptNotFoundException is thrown and all containers for that attempt including the ones that are running on other NMs are killed by the AM and marked as COMPLETE. The subsequent attempt spawns new containers just like a new attempt. This behavior is different to a Map Reduce application where the containers are not killed.
cc rohithsharma
Attachments
Attachments
Issue Links
- is related to
-
YARN-759 Create Command enum in AllocateResponse
- Closed