- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: None
- Component/s: None
- Labels:
- Target Version/s:
- Hadoop Flags: Reviewed
During verification on one of our test clusters, we found that the number of attempt unregister events was about 300k+.
- All of the AM's containers complete.
- AMRMClientImpl sends finishApplicationMaster.
- AMRMClient checks the finish status every 100 ms by resending the finishApplicationMaster request.
- AMRMClientImpl#unregisterApplicationMaster runs this loop:
    while (true) {
      FinishApplicationMasterResponse response =
          rmClient.finishApplicationMaster(request);
      if (response.getIsUnregistered()) {
        break;
      }
      LOG.info("Waiting for application to be successfully unregistered.");
      Thread.sleep(100);
    }
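- The ApplicationMasterService finishApplicationMaster interface sends an unregister event on every one of these status checks.
We should send the unregister event only once and cache the fact that it has been sent. Later requests for the same attempt should not generate another event; they should simply get a not-unregistered response back to the AM, so that the event queue is not overloaded.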
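
The proposal above could be sketched roughly as follows. This is a minimal illustration, not the actual patch: `UnregisterDeduper`, `markUnregistered`, `eventsDispatched`, and the use of plain `String` attempt IDs are all hypothetical stand-ins, and the real ApplicationMasterService and RM event dispatcher are replaced by a simple counter.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the RM-side handling; not the real YARN classes.
class UnregisterDeduper {
    // Attempts for which an unregister event has already been queued.
    private final Set<String> unregisterSent = ConcurrentHashMap.newKeySet();
    // Attempts whose unregistration has been fully processed.
    private final Set<String> unregistered = ConcurrentHashMap.newKeySet();
    // Counts events placed on the (simulated) event queue.
    private final AtomicInteger eventsDispatched = new AtomicInteger();

    /** Returns true once the attempt is unregistered, like getIsUnregistered(). */
    boolean finishApplicationMaster(String attemptId) {
        if (unregistered.contains(attemptId)) {
            return true;
        }
        // Set.add returns true only for the first request of an attempt,
        // so repeated polls never queue a second event.
        if (unregisterSent.add(attemptId)) {
            eventsDispatched.incrementAndGet();
        }
        // Not unregistered yet: the AM keeps polling, but no new event is sent.
        return false;
    }

    /** Simulates the event handler completing the unregister transition. */
    void markUnregistered(String attemptId) {
        unregistered.add(attemptId);
    }

    int eventsDispatched() {
        return eventsDispatched.get();
    }
}
```

With this shape, an AM polling every 100 ms would add one event per attempt regardless of how long unregistration takes. A real fix would presumably also need to clear the cached entry when the attempt finishes, which this sketch leaves out.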