Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Today in Kubernetes, resource quota is enforced by the quota admission controller. Quota is charged as soon as a pod is created in a namespace, regardless of whether the pod is Running, Pending, Failed, or Completed, even though a pod only consumes cluster resources while it is actually running. Once the quota is exhausted, the admission controller rejects any further pods.
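For illustration, a namespace ResourceQuota like the sketch below is charged against every admitted pod, including pods stuck in Pending. The name, namespace, and limit values here are hypothetical, not taken from any real deployment:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota      # hypothetical quota name
  namespace: spark-jobs    # hypothetical namespace for batch workloads
spec:
  hard:
    requests.cpu: "20"     # total CPU requested by ALL pods, Pending ones included
    requests.memory: 64Gi  # total memory requested by ALL pods
    pods: "50"             # pod count, also charged on admission
```

With this config, fifty Pending Spark executors would block any new pod from being admitted even if the cluster itself is idle.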
This behavior causes problems for batch workloads. Take Spark as an example: Spark job pods can stay Pending for many reasons (a volume is not ready, the pod is picky about which host it lands on, etc.). Such Pending pods still consume resource quota, which in turn prevents the resources from being used efficiently.
The proposed solution is to leverage YuniKorn's elastic queues for quota management. Their elastic nature allows unused capacity to be shared, providing efficient resource usage for batch workloads.
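As a sketch of the idea, a YuniKorn queue configuration can give a queue a guaranteed share plus an elastic maximum it may borrow up to when the cluster has spare capacity. The queue name and resource numbers below are illustrative assumptions, not values from this issue:

```yaml
partitions:
  - name: default
    queues:
      - name: root
        queues:
          - name: spark              # hypothetical queue for Spark jobs
            resources:
              guaranteed:            # share the queue can always count on
                memory: 64G
                vcore: 20
              max:                   # elastic ceiling when the cluster has headroom
                memory: 256G
                vcore: 80
```

Unlike a namespace ResourceQuota, the scheduler only charges a queue for pods it actually allocates, so Pending pods queue up inside YuniKorn instead of being rejected at admission time.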