Apache YuniKorn / YUNIKORN-1187 [Umbrella] Recovery stabilization / YUNIKORN-1217

Ensure that Spark driver pod is processed before executor pods during recovery

Details

    Description

      When a Spark workload runs with gang scheduling, the driver and executor pods carry different annotations.

      The driver pod must be processed first because it holds the task group definitions. According to https://yunikorn.apache.org/docs/next/user_guide/gang_scheduling/, an executor pod only needs the yunikorn.apache.org/taskGroupName annotation.

      Therefore, when pods are added in the recovery code path, the driver has to be added before any executor.
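
      The ordering requirement can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual recovery code. The Pod class and order_for_recovery function are hypothetical, and the annotation key used to detect task group definitions (yunikorn.apache.org/task-groups) is an assumption that should be checked against the gang scheduling guide.

```python
from dataclasses import dataclass, field

# Assumption: annotation key under which the driver carries the task group
# definitions. Verify the exact key against the gang scheduling guide.
TASK_GROUPS_ANNOTATION = "yunikorn.apache.org/task-groups"


@dataclass
class Pod:
    """Hypothetical, minimal stand-in for a Kubernetes pod object."""
    name: str
    annotations: dict = field(default_factory=dict)


def order_for_recovery(pods):
    """Return the pods with those that define task groups (drivers) first.

    The sort key is False for pods carrying the definitions and True for all
    others. sorted() is stable, so pods keep their relative order otherwise.
    """
    return sorted(pods, key=lambda p: TASK_GROUPS_ANNOTATION not in p.annotations)


# Illustrative data only: the executors are listed before the driver.
pods = [
    Pod("spark-exec-1", {"yunikorn.apache.org/taskGroupName": "spark-executor"}),
    Pod("spark-driver", {
        "yunikorn.apache.org/taskGroupName": "spark-driver",
        TASK_GROUPS_ANNOTATION: '[{"name": "spark-executor", "minMember": 2}]',
    }),
    Pod("spark-exec-2", {"yunikorn.apache.org/taskGroupName": "spark-executor"}),
]

for pod in order_for_recovery(pods):
    print(pod.name)
```

      Sorting on the presence of the definitions, rather than on pod names, keeps the sketch independent of how Spark names its pods. A real fix could equally process the pods in two passes.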

            People

              pbacsko Peter Bacsko
              pbacsko Peter Bacsko