Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
When pods are submitted to YuniKorn without an applicationId label, the admission controller assigns a generated applicationId to the pod. However, if this pod is the first (or only) pod submitted, the generated application blocks other applications from executing until a second pod is submitted or until 5 minutes have elapsed. Since we have no way to know whether another pod will be scheduled for the generated application, we should have a way to skip state-aware scheduling in this case and avoid the 5-minute delay.
To fix this:
1) Scheduler core: Add a new tag "application.stateaware.disable" which, if present, prevents the application from waiting for a second task to transition to the Running state.
2) Admission controller: Add a new label "disableStateAware: true" to a pod if neither applicationId nor spark-app-selector is provided.
3) K8S Shim: When creating a New application in the core, if the disableStateAware label exists and is true, set the "application.stateaware.disable" tag.
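The three steps above can be sketched roughly as follows. This is only an illustration of the proposed flow, not the actual YuniKorn code: the function names (admitPod, buildAppTags, mustWaitForRunning) are hypothetical, the tag value "true" is an assumption, and generation of the applicationId itself is left out. Only the label and tag names come from the description.

```go
package main

import "fmt"

const (
	// Label and tag names as given in the description.
	labelApplicationID     = "applicationId"
	labelSparkAppSelector  = "spark-app-selector"
	labelDisableStateAware = "disableStateAware"
	tagDisableStateAware   = "application.stateaware.disable"
)

// admitPod sketches step 2 (admission controller): it returns a copy of the
// pod labels, adding disableStateAware=true when neither applicationId nor
// spark-app-selector is present. Hypothetical helper.
func admitPod(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}
	_, hasAppID := out[labelApplicationID]
	_, hasSpark := out[labelSparkAppSelector]
	if !hasAppID && !hasSpark {
		out[labelDisableStateAware] = "true"
	}
	return out
}

// buildAppTags sketches step 3 (K8S shim): it sets the core tag only when the
// pod label exists and is "true". The tag value is an assumption.
func buildAppTags(podLabels map[string]string) map[string]string {
	tags := map[string]string{}
	if podLabels[labelDisableStateAware] == "true" {
		tags[tagDisableStateAware] = "true"
	}
	return tags
}

// mustWaitForRunning sketches step 1 (scheduler core): the mere presence of
// the tag disables the wait for a task to reach the Running state.
func mustWaitForRunning(tags map[string]string) bool {
	_, disabled := tags[tagDisableStateAware]
	return !disabled
}

func main() {
	// Pod without applicationId: state-aware waiting is skipped.
	generated := admitPod(map[string]string{"app": "sleep"})
	fmt.Println(mustWaitForRunning(buildAppTags(generated))) // false

	// Pod with an explicit applicationId: behaviour is unchanged.
	labelled := admitPod(map[string]string{labelApplicationID: "app-0001"})
	fmt.Println(mustWaitForRunning(buildAppTags(labelled))) // true
}
```

A user who sets disableStateAware: true on a pod that already carries an applicationId would take the same path through buildAppTags, which is the per-application opt-out.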
In addition to bypassing state-aware scheduling for generated apps, the addition of this label / tag also gives users a mechanism to opt out on a per-application basis from state-aware scheduling if necessary, such as an application containing only a single pod, or one where a second pod may not be launched in a timely manner.