Apache YuniKorn / YUNIKORN-1173

Basic scheduling fails on an existing cluster

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: shim - kubernetes
    • Labels: None

    Description

      Environment: EKS, K8s 1.20.
      K8shim built from commit: https://github.com/apache/yunikorn-k8shim/commit/be3bb70d9757b27d0c40d446306b928c79c80a9f

      Core version used: v0.0.0-20220325135453-73d55282f052

      After YuniKorn was deployed, I deleted one of the pods managed by a K8s Deployment, but YK did not schedule the replacement pod that was created, spo-og60-03-spark-operator-86cc7ff747-9vzxl.

      The new pod is stuck in Pending, and its event says "spark-operator/spo-og60-03-spark-operator-86cc7ff747-9vzxl is queued and waiting for allocation".

      The state dump and scheduler logs are attached.

      Attachments

        1. logs.txt (533 kB, Chaoran Yu)
        2. statedump.txt (123 kB, Chaoran Yu)

          People

            Assignee: Unassigned
            Reporter: Chaoran Yu (yuchaoran2011)
            Votes: 0
            Watchers: 2
