  1. Apache YuniKorn
  2. YUNIKORN-2016 [Umbrella] K8Shim simplification
  3. YUNIKORN-1169

Fix ApplicationMetadata restoration during recovery

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Delivered
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: shim - kubernetes
    • Labels: None

    Description

      The following code in general.go handles the recovery part:

      	for _, pod := range appPods {
      		log.Logger().Debug("Looking at pod for recovery candidates", zap.String("podNamespace", pod.Namespace), zap.String("podName", pod.Name))
      		// general filter passes, and pod is assigned
      		// this means the pod is already scheduled by scheduler for an existing app
      		if utils.GeneralPodFilter(pod) && utils.IsAssignedPod(pod) {
      			if meta, ok := os.getAppMetadata(pod); ok {
      				podsRecovered++
      				log.Logger().Debug("Adding appID as recovery candidate", zap.String("appID", meta.ApplicationID))
      				if _, exist := existingApps[meta.ApplicationID]; !exist {
      					existingApps[meta.ApplicationID] = meta
      				}
      ...
      

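      The crucial part is the handling of the existingApps map. The entry for each application ID is populated only once, from the first matching pod that is encountered; metadata from later pods of the same application is ignored. However, there is no guarantee that all pods of an application have the same tags or ownerReferences, so the recovered metadata depends on which pod happens to be processed first.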
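
      The first-pod-wins behavior can be illustrated with a minimal, self-contained sketch. It is not the shim's code: podInfo, appMetadata, recoverFirstWins and conflictingTagKeys are hypothetical names introduced here, and the tag keys and values are made up. The sketch only assumes that application metadata carries a set of tags derived from the pod.

```go
package main

import (
	"fmt"
	"sort"
)

// podInfo is a hypothetical stand-in for the pod fields that feed application metadata.
type podInfo struct {
	Name  string
	AppID string
	Tags  map[string]string
}

// appMetadata is a hypothetical, simplified form of the per-application metadata.
type appMetadata struct {
	ApplicationID string
	Tags          map[string]string
}

// recoverFirstWins mirrors the pattern in the snippet above: only the first pod
// seen for an application ID contributes metadata; later pods are ignored.
func recoverFirstWins(pods []podInfo) map[string]appMetadata {
	existingApps := make(map[string]appMetadata)
	for _, pod := range pods {
		if _, exist := existingApps[pod.AppID]; !exist {
			existingApps[pod.AppID] = appMetadata{ApplicationID: pod.AppID, Tags: pod.Tags}
		}
	}
	return existingApps
}

// conflictingTagKeys returns, in sorted order, the tag keys on which a pod
// disagrees with the recovered metadata (different value, or key not recovered).
func conflictingTagKeys(meta appMetadata, pod podInfo) []string {
	var keys []string
	for k, v := range pod.Tags {
		if got, ok := meta.Tags[k]; !ok || got != v {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Two pods of the same application with different (made-up) tags.
	pods := []podInfo{
		{Name: "driver", AppID: "app-1", Tags: map[string]string{"queue": "root.a"}},
		{Name: "executor", AppID: "app-1", Tags: map[string]string{"queue": "root.b", "owner": "alice"}},
	}
	apps := recoverFirstWins(pods)
	fmt.Println(apps["app-1"].Tags["queue"])                // prints "root.a"
	fmt.Println(conflictingTagKeys(apps["app-1"], pods[1])) // prints "[owner queue]"
}
```

      Reversing the order of the two pods would recover "root.b" instead, which is the kind of order-dependent side effect this issue sets out to analyze. Reporting the conflicting keys is only one possible way to make such divergence visible; it is not presented as the fix adopted for this issue.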

      The scope of this JIRA is to analyze the possible side effects of this code and come up with a better solution. A bug caused by this behavior has already been identified (see YUNIKORN-1161).

            People

              Assignee: pbacsko Peter Bacsko
              Reporter: pbacsko Peter Bacsko