Spark / SPARK-36057

SPIP: Support Customized Kubernetes Schedulers


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.0
    • Fix Version/s: None
    • Component/s: Kubernetes

    Description

      This is an umbrella issue for tracking the work to support Volcano and YuniKorn on Kubernetes. These schedulers provide more YARN-like features (such as queues and minimum resources before scheduling jobs) that many users want on Kubernetes.

      YuniKorn is an ASF project and Volcano is a CNCF project (sig-batch).

      They've taken slightly different approaches to solving the same problem, but from Spark's point of view we should be able to share much of the code.

      See the initial brainstorming discussion in SPARK-35623.

      DISCUSSION: https://lists.apache.org/thread/zv3o62xrob4dvgkbftbv5w5wy75hkbxg

      VOTE: https://lists.apache.org/thread/cz3cpp8q4pgmh7h35h6lvkwf6g3lwhcd

      VOTE Result: https://lists.apache.org/thread/nvwfo0yo0q8997vs86o7wkjyby4tbp0m

      Design DOC: https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg

      Recap slide: https://lists.apache.org/thread/mwswfwkycj71npwz8gmv1r5nrvpwj77s
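
      To make the "queues and minimum resources" idea from the description concrete, here is a minimal sketch of what a Volcano PodGroup template might look like when generated programmatically. The helper name `render_pod_group` and its parameters are hypothetical and exist only for illustration; the field names (`queue`, `minMember`, `minResources`, `priorityClassName`) follow the Volcano `scheduling.volcano.sh/v1beta1` PodGroup resource as I understand it, so verify them against the Volcano version you deploy.

```python
def render_pod_group(queue, min_cpu, min_memory, min_member=1, priority_class=None):
    """Render a Volcano PodGroup template as YAML text.

    Hypothetical helper for illustration only; it is not part of Spark or Volcano.
    """
    lines = [
        "apiVersion: scheduling.volcano.sh/v1beta1",
        "kind: PodGroup",
        "spec:",
        # Queue the job is submitted to (the YARN-like queue concept).
        f"  queue: {queue}",
        # Minimum number of pods that must be schedulable together.
        f"  minMember: {min_member}",
        # Resources that must be available before the job is scheduled.
        "  minResources:",
        f'    cpu: "{min_cpu}"',
        f'    memory: "{min_memory}"',
    ]
    if priority_class is not None:
        # Optional priority class, set explicitly rather than propagated.
        lines.append(f"  priorityClassName: {priority_class}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_pod_group("default", "2", "3Gi"))
```

      The rendered text could be written to a file and passed to Spark as a PodGroup template; the queue name and resource amounts above are placeholder values, not recommendations.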

        Issue Links

        1. Support replicasets/job API (Sub-task, Resolved, Holden Karau)
        2. Add the ability to specify a scheduler (Sub-task, Resolved, Yikun Jiang)
        3. Support for specifying executor/driver node selector (Sub-task, Resolved, Yikun Jiang)
        4. Add the ability to create resources before driver pod (Sub-task, Resolved, Yikun Jiang)
        5. Add appId interface to KubernetesConf (Sub-task, Resolved, Yikun Jiang)
        6. Add KubernetesCustom[Driver/Executor]FeatureConfigStep developer API (Sub-task, Resolved, Yikun Jiang)
        7. Upgrade kubernetes-client to 5.12.0 (Sub-task, Resolved, Yikun Jiang)
        8. Upgrade kubernetes-client to 5.12.2 (Sub-task, Resolved, Yikun Jiang)
        9. Add `volcano` module and feature step (Sub-task, Resolved, Yikun Jiang)
        10. Support queue scheduling (Introduce queue) with volcano implementations (Sub-task, Resolved, Yikun Jiang)
        11. Add volcano section to K8s IT README.md (Sub-task, Resolved, Yikun Jiang)
        12. Support priority scheduling with volcano implementations (Sub-task, Resolved, Yikun Jiang)
        13. Bump minimum Volcano version to v1.5.1 (Sub-task, Resolved, Yikun Jiang)
        14. Fix Volcano weight to be positive integer and use cpu capability instead (Sub-task, Resolved, Yikun Jiang)
        15. Support APP_ID and EXECUTOR_ID placeholder in annotations (Sub-task, Resolved, Dongjoon Hyun)
        16. Support driver/executor PodGroup templates (Sub-task, Resolved, Dongjoon Hyun)
        17. Support resource reservation (Introduce minCPU/minMemory) with volcano implementations (Sub-task, Resolved, Dongjoon Hyun)
        18. Remove spark.kubernetes.job.queue in favor of spark.kubernetes.driver.podGroupTemplateFile (Sub-task, Resolved, Dongjoon Hyun)
        19. Set the minimum Volcano version (Sub-task, Resolved, Dongjoon Hyun)
        20. Remove priorityClassName propagation in favor of explicit settings (Sub-task, Resolved, Dongjoon Hyun)
        21. Move custom scheduler-specific configs to under `spark.kubernetes.scheduler.NAME` prefix (Sub-task, Resolved, Dongjoon Hyun)
        22. Unify Statefulset* to StatefulSet* (Sub-task, Resolved, Dongjoon Hyun)
        23. Volcano feature doesn't work on EKS graviton instances (Sub-task, Resolved, Yikun Jiang)
        24. Volcano queue is not deleted (Sub-task, Resolved, Yikun Jiang)
        25. Introduce `spark.kubernetes.job` scheduling related configurations (Sub-task, Closed, Unassigned)
        26. Support backing off dynamic allocation increases if resources are "stuck" (Sub-task, Closed, Unassigned)
        27. [Deprecated] Support the Volcano Job API (Sub-task, Closed, Unassigned)
        28. Check resource after resource creation (Sub-task, Closed, Unassigned)
        29. [CI] Introduce Spark on Kubernetes CI into Volcano community (Sub-task, Closed, Unassigned)
        30. Add doc for "Customized Kubernetes Schedulers" (Sub-task, Resolved, Yikun Jiang)
        31. Add fair-share scheduling integration test (Sub-task, Open, Unassigned)
        32. Add doc for Volcano scheduler (Sub-task, Resolved, Yikun Jiang)
        33. Add yunikorn feature step (Sub-task, In Progress, Unassigned)
        34. Support job queue in YuniKorn feature step (Sub-task, Reopened, Unassigned)
        35. Add volcano module to release-build.sh (Sub-task, In Progress, Unassigned)
        36. Fix doc format/syntax error (Sub-task, Resolved, Yikun Jiang)
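
        Several of the sub-tasks above (specifying a scheduler, the custom feature-step developer API, PodGroup templates, and the `spark.kubernetes.scheduler.NAME` prefix) come together at submit time. The following is a minimal sketch of how the resulting `--conf` arguments might be assembled for Volcano. The helper `volcano_submit_conf` is hypothetical; the configuration keys and the `VolcanoFeatureStep` class name reflect my reading of the Spark 3.3 "Running on Kubernetes" documentation and should be checked against the Spark version in use.

```python
# Fully qualified feature-step class, as named in the Spark 3.3 docs
# (treat as an assumption for other versions).
VOLCANO_STEP = "org.apache.spark.deploy.k8s.features.VolcanoFeatureStep"


def volcano_submit_conf(pod_group_template_file):
    """Build spark-submit --conf arguments for running with Volcano.

    Hypothetical helper for illustration; not part of Spark.
    """
    conf = {
        # Select the custom scheduler for driver and executor pods.
        "spark.kubernetes.scheduler.name": "volcano",
        # Scheduler-specific settings live under spark.kubernetes.scheduler.NAME.
        "spark.kubernetes.scheduler.volcano.podGroupTemplateFile": pod_group_template_file,
        # Plug the Volcano feature step into both pod builders.
        "spark.kubernetes.driver.pod.featureSteps": VOLCANO_STEP,
        "spark.kubernetes.executor.pod.featureSteps": VOLCANO_STEP,
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # The template path below is a placeholder.
    print(" ".join(volcano_submit_conf("/path/to/podgroup-template.yaml")))
```

        Keeping the settings in a dictionary makes it straightforward to swap in another scheduler name and feature step later, which matches the umbrella issue's goal of sharing most of the code between schedulers.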

          People

            Assignee: Unassigned
            Reporter: Holden Karau
            Shepherd: Holden Karau
