Details
Bug
Status: Closed
Major
Resolution: Fixed
1.0.0-Ducc
None
Description
When an Agent starts up, it tries to clean up the node by killing any running processes that are still associated with cgroups on that node. It reads the pids of these processes from each active cgroup and kills them with kill -9.
All of this works fine unless one of the processes is a zombie, that is, a process that is dead but still appears in the OS process table. In this case the cgroup still exists and the process is still associated with it. Killing such a process with -9 has no effect because the process is already dead. The bug is that the Agent goes into an infinite loop waiting for the zombie to go away and, worse yet, keeps logging at 200ms intervals while it waits.
Fix: add detection of zombies so the Agent does not wait on them, and prevent the repeated logging.
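
For illustration only, here is a minimal sketch of the kind of zombie check described above. It is not the Agent's actual code: it assumes a Linux /proc filesystem, and the names proc_state, wait_until_gone and kill_and_wait, the 5 second timeout, and the proc_root parameter are all hypothetical. The 0.2 second poll interval simply mirrors the 200ms figure in the description.

```python
import os
import signal
import time


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or ')' characters, so split after the LAST ')'.
    """
    return stat_line[stat_line.rindex(")") + 1:].split()[0]


def proc_state(pid, proc_root="/proc"):
    """Return the state letter for pid ('Z' means zombie), or None if gone."""
    try:
        with open(os.path.join(proc_root, str(pid), "stat")) as f:
            return parse_state(f.read())
    except (FileNotFoundError, ProcessLookupError):
        return None  # the process has already been reaped


def wait_until_gone(pids, timeout_s=5.0, poll_s=0.2, proc_root="/proc"):
    """Wait until every pid is gone or is a zombie.

    Zombies are treated as finished because SIGKILL cannot remove them.
    Returns the set of pids still running when the (assumed) timeout
    expires, so the caller can log once instead of on every poll.
    """
    deadline = time.monotonic() + timeout_s
    pending = set(pids)
    while True:
        pending = {p for p in pending
                   if proc_state(p, proc_root) not in (None, "Z")}
        if not pending or time.monotonic() >= deadline:
            return pending
        time.sleep(poll_s)


def kill_and_wait(pids, timeout_s=5.0):
    """Send SIGKILL (kill -9) to each pid, then wait with zombie detection."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already gone
    return wait_until_gone(pids, timeout_s=timeout_s)
```

The wait is bounded and returns the leftover pids, so a caller can report them once rather than logging on every poll.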