Flink / FLINK-3539

Sync running Execution and Task instances via heartbeats

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      StephanEwen pointed out that the job manager and task manager state can get out of sync. For example, if a cancel message from the job manager to the task manager is not delivered, the Execution is marked as failed at the job manager, but the task keeps running at the task manager.

      A simple way to prevent such situations is the following:

      • The heartbeats exchanged between task manager and job manager carry information about the currently running tasks and executions
      • If a task manager reports a task that is not a running Execution at the job manager, that task is cancelled
      • If a job manager reports a running Execution that is not a running task at the task manager, that Execution is failed
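
The reconciliation rule above can be sketched as a set difference over the IDs reported by each side. This is only an illustration, not Flink code: the function `reconcile`, the `ReconcileResult` type, and the use of plain strings as execution IDs are all hypothetical. It assumes both sides identify a running task by the same execution ID, and that the job manager's set is limited to executions deployed on the reporting task manager.

```python
from dataclasses import dataclass, field


@dataclass
class ReconcileResult:
    """Outcome of comparing one heartbeat's view against the other side (hypothetical type)."""
    tasks_to_cancel: set = field(default_factory=set)     # running on the TM, unknown to the JM
    executions_to_fail: set = field(default_factory=set)  # running per the JM, missing on the TM


def reconcile(jm_running_executions, tm_running_tasks):
    """Compare the job manager's running Executions with the tasks one task manager reports.

    Both arguments are iterables of execution IDs (assumed to be shared by both sides).
    """
    jm = set(jm_running_executions)
    tm = set(tm_running_tasks)
    return ReconcileResult(
        tasks_to_cancel=tm - jm,      # task reported, but no matching running Execution
        executions_to_fail=jm - tm,   # Execution running, but no matching task reported
    )


# Example: the cancel message for "exec-3" was lost, so the task is still running on the TM,
# while "exec-1" is considered running by the JM but is absent from the TM report.
result = reconcile({"exec-1", "exec-2"}, {"exec-2", "exec-3"})
print(sorted(result.tasks_to_cancel))     # ['exec-3']
print(sorted(result.executions_to_fail))  # ['exec-1']
```

A real implementation would probably also need to tolerate executions whose deployment or cancellation is still in flight when the heartbeat is sent, for example by acting only on a mismatch that persists across several heartbeats. That refinement is not part of the proposal above.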

            People

              Assignee: Unassigned
              Reporter: Ufuk Celebi (uce)