Details
- Improvement
- Status: Closed
- Major
- Resolution: Not A Problem
- 1.10.0
- None
- None
Description
Currently, when the JobManager sends a deletion request, a TaskManager instance is stopped by directly calling KubernetesClient.pods().withName().delete. As a result, the instance is killed abruptly with a KILL signal and has no chance to clean up. This can cause problems, because we expect the process to terminate gracefully once it is no longer needed.
According to the Kubernetes guide on Termination of Pods, a TERM signal is first sent to the main process in each container, and may be followed by a forced KILL signal once the graceful shutdown period has expired. The Unix signal is sent to the process with PID 1 (see Docker Kill). However, the TaskManagerRunner process is spawned by /opt/flink/bin/kubernetes-entry.sh and never has PID 1, so it never receives the TERM signal.
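To illustrate the PID issue described above, here is a minimal shell sketch. It is not the content of kubernetes-entry.sh; the helper names run_wrapper and same_pid are hypothetical, and the inner `sh -c` merely stands in for the TaskManagerRunner process. It compares a wrapper script that spawns its command as a child process with one that replaces itself via `exec`.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT the content of kubernetes-entry.sh.
# run_wrapper and same_pid are hypothetical helper names.

# Run a wrapper shell that prints its own PID and then starts a stand-in
# command, either as a separate child process ("spawn") or by replacing
# itself with that command ("exec").
run_wrapper() {
  sh -c '
    echo "wrapper=$$"
    if [ "$1" = "exec" ]; then
      exec sh -c "echo child=\$\$"
    else
      sh -c "echo child=\$\$"
      :  # the wrapper keeps running after the child exits
    fi
  ' wrapper "$1"
}

# Succeed if the stand-in command ran under the same PID as the wrapper.
same_pid() {
  out=$(run_wrapper "$1")
  w=$(printf '%s\n' "$out" | sed -n 's/^wrapper=//p')
  c=$(printf '%s\n' "$out" | sed -n 's/^child=//p')
  [ -n "$w" ] && [ "$w" = "$c" ]
}

for mode in spawn exec; do
  if same_pid "$mode"; then
    echo "$mode: command kept the PID of the wrapper"
  else
    echo "$mode: command got a new PID"
  fi
done
```

In a container, the wrapper would be the process with PID 1. Under the assumptions above, only the `exec` variant leaves the long-running command in the position where a signal addressed to PID 1 would reach it directly.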
Issue Links
- is blocked by: FLINK-17034 Execute the container CMD under TINI for better hygiene (Open)