Details
- Sub-task
- Status: Closed
- Major
- Resolution: Delivered
Description
The following code in general.go handles the recovery part:
    for _, pod := range appPods {
        log.Logger().Debug("Looking at pod for recovery candidates",
            zap.String("podNamespace", pod.Namespace),
            zap.String("podName", pod.Name))
        // general filter passes, and pod is assigned
        // this means the pod is already scheduled by scheduler for an existing app
        if utils.GeneralPodFilter(pod) && utils.IsAssignedPod(pod) {
            if meta, ok := os.getAppMetadata(pod); ok {
                podsRecovered++
                log.Logger().Debug("Adding appID as recovery candidate",
                    zap.String("appID", meta.ApplicationID))
                if _, exist := existingApps[meta.ApplicationID]; !exist {
                    existingApps[meta.ApplicationID] = meta
                }
                ...
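The crucial part is the handling of the existingApps map. The entry for each application ID is populated only once, from the first pod encountered for that application. However, there is no guarantee that all pods of an application have the same tags or ownerReferences, so metadata that appears only on later pods is silently ignored.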
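The effect can be reproduced in isolation. The following is a minimal sketch, not the YuniKorn code: podInfo, appMeta, firstPodWins and mergeAllPods are hypothetical names, and the tag keys are made up for illustration. It compares the "populate once" pattern with one possible alternative that folds every pod's metadata into the entry. Whether merging is the right behavior for the shim is exactly what this JIRA needs to analyze.

```go
package main

import "fmt"

// podInfo is a hypothetical, simplified stand-in for a Kubernetes pod.
type podInfo struct {
	Name     string
	AppID    string
	Tags     map[string]string
	OwnerUID string // empty when the pod has no owner reference
}

// appMeta is a hypothetical stand-in for the per-application metadata.
type appMeta struct {
	ApplicationID string
	Tags          map[string]string
	OwnerUIDs     []string
}

// metaFromPod builds metadata from a single pod, copying the tag map.
func metaFromPod(p podInfo) appMeta {
	m := appMeta{ApplicationID: p.AppID, Tags: map[string]string{}}
	for k, v := range p.Tags {
		m.Tags[k] = v
	}
	if p.OwnerUID != "" {
		m.OwnerUIDs = append(m.OwnerUIDs, p.OwnerUID)
	}
	return m
}

// firstPodWins mirrors the pattern in the snippet above: the map entry
// is written only for the first pod seen with a given application ID.
func firstPodWins(pods []podInfo) map[string]appMeta {
	apps := make(map[string]appMeta)
	for _, p := range pods {
		if _, exist := apps[p.AppID]; !exist {
			apps[p.AppID] = metaFromPod(p)
		}
	}
	return apps
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// mergeAllPods is one possible alternative (an assumption, not the
// actual fix): tag keys missing from the entry are added, existing keys
// keep their first value, and owner UIDs are collected without duplicates.
func mergeAllPods(pods []podInfo) map[string]appMeta {
	apps := make(map[string]appMeta)
	for _, p := range pods {
		meta, exist := apps[p.AppID]
		if !exist {
			apps[p.AppID] = metaFromPod(p)
			continue
		}
		for k, v := range p.Tags {
			if _, set := meta.Tags[k]; !set {
				meta.Tags[k] = v
			}
		}
		if p.OwnerUID != "" && !contains(meta.OwnerUIDs, p.OwnerUID) {
			meta.OwnerUIDs = append(meta.OwnerUIDs, p.OwnerUID)
		}
		apps[p.AppID] = meta
	}
	return apps
}

// samplePods returns two pods of the same app with different metadata.
func samplePods() []podInfo {
	return []podInfo{
		{Name: "pod-a", AppID: "app-1", Tags: map[string]string{"queue": "root.default"}},
		{Name: "pod-b", AppID: "app-1", OwnerUID: "uid-1",
			Tags: map[string]string{"queue": "root.default", "role": "driver"}},
	}
}

func main() {
	pods := samplePods()
	first := firstPodWins(pods)["app-1"]
	merged := mergeAllPods(pods)["app-1"]
	fmt.Println(len(first.Tags), len(first.OwnerUIDs))   // prints "1 0"
	fmt.Println(len(merged.Tags), len(merged.OwnerUIDs)) // prints "2 1"
}
```

With the first-pod-wins pattern, the "role" tag and the owner reference of pod-b never reach the application entry, and the result depends on the order in which the pods are listed.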
The scope of this JIRA is to analyze the possible side-effects of this code and come up with a better solution. A bug was already identified because of this (see YUNIKORN-1161).
Issue Links
- is fixed by: YUNIKORN-2180 Clean up scheduler state initialization (Closed)
- relates to: YUNIKORN-1161 Pods not linked to placeholders are stuck in Running state if YK is restarted (Closed)