Thanks Arun Suresh for updating the patch.
I think the latest patch is much better than the previous one, but I found we can merge logics of execution type update and resizing even more. And we can make call flow of allocation more clear:
High level comments:
1) Can we uniform call flow of OpportunisticContainerAllocatorAMService (OCAAMS) and normal AMS?
Currently, OCAAMS gets SchedulerApplicationAttempt, and then directly update request to ContainerUpdateContext. And for increase/decrease request, they will be added to Scheduler.allocate, and set through AppSchedulingInfo#updateResourceRequests.
I think the two could be merged: Since we have a unified field of UpdateContainerRequest inside AllocateRequest, and we calls RMServerUtils#validateAndSplitUpdateResourceRequests from ApplicationMasterService, so we should be able to split increase/decrease/demote/promote requests. Then increase / promote requests will be converted to normal resource requests and add to AppSchedulingInfo (via Scheduler#allocate).
Inside AppSchedulingInfo, it could uses ContainerUpdateContext to maintain all container-update-related requests.
2) Similar to above, it's better to unify call flow of fetch container update result as well:
Currently it calls appAttempt.pullContainersWithUpdatedExecType, I think it should be merged to FiCaSchedulerApp#getAllocation. And actually, we should remove increasedContainers/decreasedContainers, and instead we should have: List<Entry<ContainerUpdateType, List<Container>> to capture all kinds of container updates.
Summary of my opinions about call flow:
a. Send update container request to scheduler:
For increase/decrease container: Inside ApplicationMasterService, calls scheduler#allocate (same as today)
For promotion/demotion container:
OCAAMS split request to two parts:
- List of opportunistic container allocations: opportunistic allocations Requests goes to OpportunisticContainerAllocator (same as today)
- UpdateContainerRequests (promotion and demotion) goes to ApplicationMasterService
Minor refactoring may require for Scheduler#allocate's parameters, since we can merge all container update requests to a single one. like List<Entry<ContainerUpdateType, List<ContainerUpdateRequest>>.
b. Get update container response from scheduler:
For both increase/decrease/promotion/demotion containers, they will be set to Allocation by SchedulerApplicationAttempt implementations.
3) Can we avoid changing AbstractResourceRequest (assume we may remove it on
YARN-6022)? Adding a updatedContainerId to ResourceRequest is very confusing to application, because they may set ResourceRequest#updatedContainerId to update a container, but it won't work.
I believe we should have enough context (like the ContainerId inside SchedulerRequestKey) already, could you reconsider this part?
4) ContainerUpdateType: It's better to have separated promotion / demotion instead of a single "update-execution-type".