Apache YuniKorn / YUNIKORN-1628

[Umbrella] YuniKorn application traceability

Details

    Description

      The current implementation of YuniKorn focuses on the application and its states. Kubernetes does not, and cannot, provide details on what happens inside the application, which limits what YuniKorn can offer for applications.

      To increase supportability, we need to understand what happens inside the core scheduler and how we got into a certain state.

      Requirements:

      1. Record a stream of events in memory whenever something relevant happens to an application or a node:
        • Partition changed (nodes added / removed, capacity changed, etc.)
        • Application created / removed
        • An ask is created / removed
        • An allocation is created / removed
        • Reservation occurs
        • Placeholder is replaced, etc.
      2. The recorded events should be available through the REST interface.
      3. The number of stored events can be limited by two settings: a maximum number of events or an expiration time (e.g., only events from the past 5 minutes).
      4. Take advantage of Go channels to avoid any potential blocking.
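
      Requirements 1, 3 and 4 can be sketched together in Go. This is only a minimal illustration under stated assumptions, not YuniKorn's actual implementation: the names `Recorder`, `Event`, `EventType`, the event type constants, and the `queueSize` parameter are all hypothetical. Producers hand events to a buffered channel with a non-blocking send, and a single goroutine stores them while enforcing both limits.

```go
package main

import (
	"sync"
	"time"
)

// EventType names the kind of change being recorded (hypothetical values).
type EventType string

const (
	AppAdded   EventType = "APP_ADDED"
	AskAdded   EventType = "ASK_ADDED"
	AllocAdded EventType = "ALLOC_ADDED"
)

// Event is one recorded change; the JSON tags are an assumption so that
// a REST handler could serialize a snapshot directly.
type Event struct {
	Type      EventType `json:"type"`
	ObjectID  string    `json:"objectID"`
	Message   string    `json:"message"`
	Timestamp time.Time `json:"timestamp"`
}

// Recorder keeps recent events in memory, bounded by count and by age.
type Recorder struct {
	ch        chan Event
	done      chan struct{}
	mu        sync.Mutex
	events    []Event
	maxEvents int
	maxAge    time.Duration
}

// NewRecorder starts the consumer goroutine that owns all writes to the store.
func NewRecorder(maxEvents int, maxAge time.Duration, queueSize int) *Recorder {
	r := &Recorder{
		ch:        make(chan Event, queueSize),
		done:      make(chan struct{}),
		maxEvents: maxEvents,
		maxAge:    maxAge,
	}
	go r.consume()
	return r
}

// Record never blocks the caller: if the queue is full the event is
// dropped and false is returned. It must not be called after Close.
func (r *Recorder) Record(t EventType, objectID, message string) bool {
	ev := Event{Type: t, ObjectID: objectID, Message: message, Timestamp: time.Now()}
	select {
	case r.ch <- ev:
		return true
	default:
		return false
	}
}

// consume appends queued events to the store until the channel is closed.
func (r *Recorder) consume() {
	defer close(r.done)
	for ev := range r.ch {
		r.mu.Lock()
		r.events = append(r.events, ev)
		r.trimLocked(ev.Timestamp)
		r.mu.Unlock()
	}
}

// trimLocked drops expired events first, then the oldest surplus ones.
// The caller must hold r.mu.
func (r *Recorder) trimLocked(now time.Time) {
	cutoff := now.Add(-r.maxAge)
	i := 0
	for i < len(r.events) && r.events[i].Timestamp.Before(cutoff) {
		i++
	}
	if extra := len(r.events) - i - r.maxEvents; extra > 0 {
		i += extra
	}
	r.events = r.events[i:]
}

// Snapshot returns a copy of the events that are still within both limits.
func (r *Recorder) Snapshot() []Event {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.trimLocked(time.Now())
	out := make([]Event, len(r.events))
	copy(out, r.events)
	return out
}

// Close stops accepting events and waits until the queue is drained.
func (r *Recorder) Close() {
	close(r.ch)
	<-r.done
}
```

      In this sketch, `NewRecorder(1000, 5*time.Minute, 256)` would keep at most 1000 events no older than five minutes. Dropping an event when the queue is full is one possible trade-off: it keeps the scheduling path from ever waiting on the recorder, at the cost of losing events under heavy load. A REST handler for requirement 2 could serialize the result of `Snapshot()`.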

       

            People

              Assignee: Peter Bacsko (pbacsko)
              Reporter: Peter Bacsko (pbacsko)