Thanks, Arun, for the review and the suggestions.
There are a few problems around this.
1. If the RM doesn't return an application report, the result is a NullPointerException.
This can be handled by redirecting to the History Server, as it may still be aware of the application.
2. After redirecting to the History Server, if the History Server doesn't have information about the application (or fails to return it for some other reason), the client goes into an infinite loop and keeps printing the message.
I have faced a similar problem. The RM returns the application report with status as success, and the client then redirects to the History Server. The History Server is not able to find the application info and throws an exception. That exception is converted to an InvocationTargetException, and the call is retried indefinitely.
3. If an exception other than 'YarnRemoteException' or 'InvocationTargetException' is thrown, the call is also retried indefinitely. This needs to break at some point.
Here we need to differentiate remote-end exceptions from connection failures to the RM/AM/HS. If it is a remote-end exception, it can be reported directly. If it is a connection failure, the retry can happen in the RPC layer, and the failure can be reported once the retries are exhausted.
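To make the proposal concrete, here is a rough sketch of the behaviour I have in mind. It is not the actual client code: BoundedRetrySketch, RemoteSideException, unwrap and invokeWithRetries are hypothetical names used only for illustration, RemoteSideException stands in for whatever the remote end (RM/AM/HS) throws, and a plain java.net.ConnectException stands in for a connection failure. The retry loop is written inline only to keep the sketch self-contained, and the attempt limit is an assumed parameter.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.net.ConnectException;
import java.util.concurrent.Callable;

public class BoundedRetrySketch {

    /** Hypothetical stand-in for an error raised by the remote end (RM/AM/HS). */
    public static class RemoteSideException extends IOException {
        public RemoteSideException(String message) {
            super(message);
        }
    }

    /** Strips InvocationTargetException wrappers so the real cause can be inspected. */
    public static Throwable unwrap(Throwable t) {
        while (t instanceof InvocationTargetException && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    /**
     * Invokes the call at most maxAttempts times. Only connection failures are
     * retried; any other failure is reported to the caller immediately.
     */
    public static <T> T invokeWithRetries(Callable<T> call, int maxAttempts)
            throws IOException {
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                Throwable cause = unwrap(e);
                if (cause instanceof ConnectException) {
                    // Connection failure: remember it and try again, up to the limit.
                    lastFailure = (ConnectException) cause;
                    continue;
                }
                if (cause instanceof IOException) {
                    // Remote-end (or other I/O) error: report it directly, no retry.
                    throw (IOException) cause;
                }
                // Anything else is also reported at once instead of looping.
                throw new IOException("Unexpected failure", cause);
            }
        }
        throw new IOException("Giving up after " + maxAttempts + " attempts", lastFailure);
    }
}
```

With something like this, the case in point 2 (History Server cannot find the application) surfaces after a single call, and the case in point 3 cannot loop forever because every path either returns, rethrows, or runs out of attempts.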
Please provide your suggestions.