Flink / FLINK-15843

Gracefully shutdown TaskManagers on Kubernetes

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.10.0
    • None
    • None

    Description

      Currently, when the JobManager sends a deletion request, a TaskManager instance is stopped by directly calling KubernetesClient.pods().withName().delete. The instance is therefore killed abruptly with a KILL signal and has no chance to clean up, which could cause problems because we expect the process to terminate gracefully when it is no longer needed.

      According to the Kubernetes guide on Termination of Pods, a TERM signal is first sent to the main process in each container, and may be followed by a forced KILL signal once the graceful shutdown period has expired. The Unix signal is sent to the process that has PID 1 (see Docker Kill). However, the TaskManagerRunner process is spawned by /opt/flink/bin/kubernetes-entry.sh and could never have PID 1, so it would never receive the TERM signal.
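
      The PID argument above can be illustrated with a minimal shell sketch. It assumes a POSIX sh and uses a hypothetical stand-in workload, not the real kubernetes-entry.sh or TaskManagerRunner: a wrapper that spawns its command as a child keeps its own PID, while a wrapper that uses exec hands its PID over to the command. Whether the actual entry script behaves in either way is not established here.

```shell
#!/bin/sh
# Hypothetical stand-in for the real workload: it only reports the PID of
# the process that ends up running it.
WORKLOAD='echo "workload_pid=$$"'

# Wrapper that spawns the workload as a child process. The wrapper keeps its
# own PID and the workload gets a different one. The trailing ":" ensures the
# workload is not the wrapper's last command, so the shell cannot implicitly
# exec it.
spawned=$(sh -c 'echo "wrapper_pid=$$"; sh -c "$1"; :' wrapper "$WORKLOAD")

# Wrapper that uses exec. The workload replaces the wrapper process and
# therefore inherits the wrapper's PID.
execed=$(sh -c 'echo "wrapper_pid=$$"; exec sh -c "$1"' wrapper "$WORKLOAD")

printf 'without exec:\n%s\n\nwith exec:\n%s\n' "$spawned" "$execed"
```

      When run, the two PIDs in the first group differ and the two in the second group match. A signal addressed to the wrapper's PID reaches the workload directly only in the second case.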

            People

              Assignee: Unassigned
              Reporter: Canbin Zheng (felixzheng)
              Votes: 0
              Watchers: 3
