Apache YuniKorn / YUNIKORN-2031

metric yunikorn_root_.*_resource{state="pending"} is inaccurate

    Description

      Given a sample static queue configmap yaml:

       

      apiVersion: v1
      data:
        queues.yaml: |
          partitions:
          - name: default
            queues:
            - name: root
              submitacl: '*'
              queues:
              - name: dev
                resources:
                  max:
                    memory: 20Gi
                    vcore: "5"
              - name: sre
                resources:
                  max:
                    memory: 20Gi
                    vcore: "5"
      kind: ConfigMap
      metadata:
        name: yunikorn-configs
        namespace: default 

       

      and a Pod template yaml like:

       

      kind: Pod
      apiVersion: v1
      metadata:
        generateName: dev-
        labels:
          applicationId: dev
          queue: root.dev
      spec:
        schedulerName: yunikorn
        containers:
        - name: pause
          image: registry.k8s.io/pause
          resources:
            requests:
              cpu: 1
            limits:
              cpu: 1 

       

      If I create 6 pods:

       

      for i in {1..6}; do kubectl create -f pods/dev-pod.yaml; done

      5 Pods are in Running state and 1 is in Pending state, which is expected because the dev queue's max quota is 5 vcore (5 CPUs) and each Pod requests 1 CPU.
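
The arithmetic behind that expectation can be sketched as follows. This is a simplified model, not YuniKorn's scheduling logic: the function name `expected_split` is hypothetical, only the CPU quota is considered (the Pod template requests no memory), and node capacity is assumed not to be the limiting factor.

```python
def expected_split(num_pods, cpu_per_pod_milli, queue_max_milli):
    """Return (running, pending) pod counts under a CPU-only queue quota.

    Hypothetical helper for illustration; values are in millicores.
    """
    # Number of identical pods that fit under the queue's max quota.
    fit = queue_max_milli // cpu_per_pod_milli
    running = min(num_pods, fit)
    return running, num_pods - running


# 6 pods, each requesting 1 CPU (1000m), in a queue capped at 5 vcore (5000m).
running, pending = expected_split(6, 1000, 5000)
print(running, pending)
```

Under these assumptions, the pending side of the queue should account for exactly one Pod, that is 1000 millicores.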

       

      However, the metric yunikorn_root_dev_queue_resource overstates the pending pods and resources (2 pods and 2000 vcore instead of the expected 1 pod and 1000 vcore):

      Metric                                                                                                                Value
      yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="pods", state="allocated"}   5
      yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="pods", state="pending"}     2
      yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="vcore", state="allocated"}  5000
      yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="vcore", state="pending"}    2000
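
To make the discrepancy explicit, the samples above can be compared against the values expected from the scenario (5 Pods allocated, 1 Pod pending, 1 CPU each). The sketch below is only an illustration: the helper names `parse_samples` and `find_mismatches` are hypothetical, the parser handles only simple `name{labels} value` lines rather than the full Prometheus exposition format, and the expected values are an assumption derived from the reproduction steps.

```python
import re

# Samples copied from the table above.
SAMPLES = """\
yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="pods", state="allocated"} 5
yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="pods", state="pending"} 2
yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="vcore", state="allocated"} 5000
yunikorn_root_dev_queue_resource{instance="localhost:9080", job="yunikorn", resource="vcore", state="pending"} 2000
"""

# Assumed expectation: 5 running pods, 1 pending pod, 1000 millicores each.
EXPECTED = {
    ("pods", "allocated"): 5,
    ("pods", "pending"): 1,
    ("vcore", "allocated"): 5000,
    ("vcore", "pending"): 1000,
}

LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return {(resource, state): value} for each simple sample line."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        labels = dict(LABEL_RE.findall(match.group(2)))
        result[(labels["resource"], labels["state"])] = float(match.group(3))
    return result


def find_mismatches(observed, expected):
    """Return {key: (observed, expected)} for every key whose values differ."""
    return {
        key: (observed.get(key), want)
        for key, want in expected.items()
        if observed.get(key) != want
    }


mismatches = find_mismatches(parse_samples(SAMPLES), EXPECTED)
for key, (got, want) in mismatches.items():
    print(key, "observed", got, "expected", want)
```

Only the two `state="pending"` samples differ from the expectation; the allocated values match.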

            People

              weihuang Wei Huang