Apache YuniKorn / YUNIKORN-201

[Umbrella] Application tracking API and CRD - phase 1

Details

    Description

      Today, YK works behind the scenes, and the workflow is:

      1. An app operator or job server launches a set of pods on K8s.
      2. YK gets notified and groups the pods into apps based on the appID.
      3. YK schedules the pods with respect to the app info.

      This provides a simple model to integrate with existing K8s and to support workloads, but it has some user-experience issues, such as:

      1. YK can hardly manage the app lifecycle end to end. An outstanding issue is that we do not know when an app is finished if we only look at pod status.
      2. YK doesn't have the ability to admit apps. We need to be able to admit apps based on various conditions, e.g. resource quota, cluster overhead, ACLs, etc.
      3. It is hard to track app status. Sometimes an app might be pending in a resource queue, but we do not have a good way to expose such status info.

      To further improve the user experience, we need to introduce an application tracking API and a K8s custom resource definition (CRD). The CRD will be used by the app operator/job server to interact with YK, so that the app lifecycle is fully controlled.
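
      For illustration, a minimal sketch of what such an application CRD instance might look like (the apiVersion, kind, and field names here are assumptions for discussion, not a final spec):

      apiVersion: yunikorn.apache.org/v1alpha1   # illustrative group/version
      kind: Application
      metadata:
        name: example-app
        namespace: default
      spec:
        queue: root.default        # resource queue the app is submitted to
        # pods carrying the matching appID label are grouped under this app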

Activity

            tingyao TingYao Huang added a comment - edited

            When I tried the integration between YuniKorn and the Spark operator, an issue occurred.

            Even when the SparkApplication is in Completed status, our YuniKorn application CRD gets stuck at Running.

            In the web UI, that application is also in Running status, but with no used resources.

            Perhaps it is caused by YUNIKORN-26.

            kmarton Kinga Marton added a comment - edited

            I think we should raise the priority of YUNIKORN-26 and fix it. wwei, wilfreds, any thoughts?

            wwei Weiwei Yang added a comment -

            hi Huang Ting Yao, kmarton

            In this case, I would expect the state to be "Waiting" instead of "Running", according to http://yunikorn.apache.org/docs/next/design/scheduler_object_states. Not sure if there is a bug here.
            This is caused by YUNIKORN-26, but the problem is a bit complicated... the reason is that YuniKorn doesn't know whether the job has completed/failed/succeeded; only the operator knows that. Internally, the spark-k8s-operator monitors the Spark driver/executor pods and changes the SparkApplication state based on certain conditions. This is per-app logic that can never be coded into YuniKorn. Based on this, I'd propose:

            1. Change the State field in the app-CRD to "scheduling state", to indicate that it only reflects the state in the scheduler.
            2. Make sure that when an app has no allocations, its state is "Waiting".
            3. When the SparkApplication is deleted, delete the app-CRD as well, and then remove the app from the scheduler.
            kmarton Kinga Marton added a comment - edited

            wwei,

            Change the State field in the app-CRD to "scheduling state", to indicate that it only reflects the state in the scheduler.

            I can change the app status to a scheduling state, but I would keep the Status, because this is a predefined subresource. With this change, the Status will have the following format:

            status:
              type: object
              properties:
                scheduling_status:   # scheduler-side state only (e.g. Running, Waiting)
                  type: string
                message:             # human-readable detail about the current state
                  type: string
                lastupdate:          # timestamp of the last state change
                  type: string
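
            For illustration, a populated status might then look like this (the concrete values are hypothetical):

            status:
              scheduling_status: Waiting                  # mirrors the core-side scheduling state
              message: "application has no allocations"   # hypothetical message text
              lastupdate: "2020-07-08T10:15:00Z"          # hypothetical timestamp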
            

            Make sure that when an app has no allocations, its state is "Waiting".

            If we change the status in the CRD, we will need to make changes on the core side as well, since we agreed that the source of truth will be the core-side state, and I don't think it is a good idea to make an exception for this case.

            When the SparkApplication is deleted, delete the app-CRD as well, and then remove the app from the scheduler.

            I think we can handle this issue with YUNIKORN-266, where we will delete the related pods as well when the application is deleted.

            Huang Ting Yao, is it possible to set the SparkApplication as the ownerReference for the CRD, so that it gets deleted when the SparkApplication is deleted?

            Related to YUNIKORN-26, I still think we should allocate some time to fix that issue, because as long as it is open, I have the impression that we don't have a stable foundation for application handling.

            adam.antal Adam Antal added a comment -

            One thing that is still not clear to me: will there be a "finished" status for the App CRD? From your latest comment I assumed that when the app is finished and the CRD is deleted, the status changes from "Waiting" to none, because the app no longer exists.

            kmarton Kinga Marton added a comment -

            adam.antal, I am not sure that I got your point.

            when the app is finished and the CRD is deleted

            Why would we need a status if we delete the CRD? The status of the CRD must always be the same as the status of the application on the core side.

            tingyao TingYao Huang added a comment - edited

            kmarton, the SparkApplication is already set as the ownerReference when the spark-operator creates the YuniKorn application CRD.

            So when we delete the SparkApplication, the YuniKorn CRD will also be deleted. A sketch of that metadata is shown below.
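
            For reference, a minimal sketch of how that ownerReference might look on the YuniKorn application CRD (names and uid are illustrative):

            metadata:
              name: example-app
              ownerReferences:
                - apiVersion: sparkoperator.k8s.io/v1beta2   # Spark operator API group
                  kind: SparkApplication
                  name: spark-pi                             # the owning SparkApplication
                  uid: <uid of the SparkApplication>         # placeholder, filled in by the operator
                  controller: true

            With this set, Kubernetes garbage collection removes the dependent CRD automatically when the owning SparkApplication is deleted.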

            wwei Weiwei Yang added a comment - edited

            I can change the app status to a scheduling state, but I would keep the Status, because this is a predefined subresource.

            OK, this is not a MUST from what I can see. Changing to SchedulingState is merely to avoid giving users the impression that this field is the source of truth about app states; I can see it confusing a lot of people. We need to document this carefully, and hopefully it makes sense to the users.

            will there be a "finished" status for the App CRD

            Unfortunately, we will not be able to set a "finished" state in the app CRD today. Only app operators understand when an app is finished/completed; in the scheduler, we cannot tell that from the info we have. E.g. we cannot assume a job is completed just because no pod is running; a good example is ScheduledSparkApplication, where after one run succeeds and before the 2nd run launches, there is no pod running but the app is not finished.

            So the fix for YUNIKORN-26 won't be that easy. We would have to introduce a way to get feedback from app operators, observe when an app is finished, notify the scheduler-core, and change the state accordingly. The logic can be different for different apps, and I doubt that's something we want to do. Instead, I suggest simply tracking the scheduling state in our CRD.

            when the app is finished and the CRD is deleted, the status changes from "Waiting" to none

            When an app is deleted, we will make sure the corresponding app-CRD is also deleted, and subsequently we will delete the app from the scheduler. We do not need to change the state in this case.


            People

              kmarton Kinga Marton
              wwei Weiwei Yang