Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
- None
Description
The current implementation of YuniKorn is focused on the application and the states of the application. K8s does not and cannot provide details on what happens inside the application. This limits what we can offer at a YuniKorn level for applications.
To increase supportability, we need to understand what happens inside the core scheduler and how we got into a certain state.
Requirements:
- Record a stream of events in memory whenever something relevant happens to an application or a node:
  - Partition changed (nodes added / removed, capacity changed, etc.)
  - Application created / removed
  - An ask is created / removed
  - An allocation is created / removed
  - Reservation occurs
  - Placeholder is replaced, etc.
- The recorded events should be available from the REST interface
- The number of stored events can be limited by two settings: a maximum number of events or an expiration time (e.g. events from the past 5 minutes).
- Take advantage of Go channels to avoid any potential blocking
Issue Links
- blocks
  - YUNIKORN-2017 [Umbrella] YuniKorn application traceability - Phase 2 (Closed)
- is related to
  - YUNIKORN-2442 Documentation update about the event system (Resolved)
- relates to
  - YUNIKORN-2115 [Umbrella] YuniKorn application traceability - phase II (Closed)
  - YUNIKORN-1702 Application is removed from Yunikorn UI upon completion, better to keep for additional time (Closed)