Apache YuniKorn / YUNIKORN-2465

Remove Task objects from the shim upon pod completion

Details

    Description

      We don't remove Task objects from the shim when a pod completes. This affects long-running workloads, such as Spark Streaming, that keep generating new pods with the same applicationID. The ever-increasing memory usage eventually results in an OOM and the termination of YuniKorn. Tasks are only removed when the application reaches the Completed state in the scheduler-core.

      A restart fixes the situation because completed pods are not restored and are not added back to the Context/Application. We should remove tasks during the lifetime of the application unless there's a good reason not to.
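
The proposed behavior can be illustrated with a minimal Go sketch. It is not the shim's actual code: `Task`, `Application`, `AddTask`, `OnPodCompleted` and `TaskCount` are hypothetical names chosen for this example. It only shows the idea of dropping a task from the application's task map as soon as its pod completes, rather than waiting for the application to reach Completed.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical stand-in for the shim's per-pod Task object.
type Task struct {
	ID string
}

// Application is a hypothetical stand-in for the shim's Application.
// It tracks one Task per pod for a single applicationID.
type Application struct {
	mu    sync.Mutex
	ID    string
	tasks map[string]*Task
}

// NewApplication creates an application with an empty task map.
func NewApplication(id string) *Application {
	return &Application{ID: id, tasks: make(map[string]*Task)}
}

// AddTask registers a task for a newly seen pod.
func (a *Application) AddTask(taskID string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.tasks[taskID] = &Task{ID: taskID}
}

// OnPodCompleted removes the task as soon as its pod completes.
// It returns true if a task was actually removed.
func (a *Application) OnPodCompleted(taskID string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, ok := a.tasks[taskID]; !ok {
		return false
	}
	delete(a.tasks, taskID)
	return true
}

// TaskCount reports how many tasks are still tracked.
func (a *Application) TaskCount() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.tasks)
}

func main() {
	// Simulate a long-running application that keeps creating pods
	// under the same applicationID.
	app := NewApplication("spark-streaming-app")
	for i := 0; i < 1000; i++ {
		id := fmt.Sprintf("pod-%d", i)
		app.AddTask(id)
		app.OnPodCompleted(id)
	}
	fmt.Println("tracked tasks:", app.TaskCount())
}
```

With removal on completion, the number of tracked tasks is bounded by the number of live pods, so the loop above ends with zero tracked tasks. Without the `OnPodCompleted` call, the map would hold all 1000 entries until the application itself completed.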

      Attachments

        1. remove_ask_alloc.poc
          4 kB
          Peter Bacsko

            People

              pbacsko Peter Bacsko
              pbacsko Peter Bacsko