Details
Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Description
We are seeing the service get stuck when latency increases: even when the cluster has available resources, YuniKorn is unable to schedule apps, and we have to restart YuniKorn manually.
We profiled the scheduler and found that most of the time is spent in tryReservedAllocate.
Attached are a profiling screenshot and the service latency data.