Flink / FLINK-5193

Recovering all jobs fails completely if a single recovery fails


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.3, 1.2.0
    • Fix Version/s: 1.1.4, 1.2.0
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      In the HA case, when the JobManager tries to recover all submitted job graphs (e.g. after regaining leadership), a single failed recovery can prevent every submitted job from being recovered. Instead of failing the complete recovery procedure, the JobManager should still try to recover the remaining (non-failing) jobs and print a proper error message for each failed recovery.
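
      The requested behaviour can be illustrated with a minimal sketch. This is not the actual JobManager code or the fix that was committed; `JobRecoverySketch`, `JobRecoverer`, `RecoveryResult` and `recoverAll` are hypothetical names invented for the example, and job graphs are reduced to plain string IDs. The point is only that each recovery is wrapped in its own try/catch, so one failure is reported without aborting the others.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JobRecoverySketch {

    /** Hypothetical stand-in for whatever loads and resubmits one stored job graph. */
    public interface JobRecoverer {
        void recover(String jobId) throws Exception;
    }

    /** Outcome of one recovery pass: recovered job IDs, and failed job IDs with their cause. */
    public static final class RecoveryResult {
        public final List<String> recovered = new ArrayList<>();
        public final Map<String, Exception> failed = new LinkedHashMap<>();
    }

    /**
     * Tries to recover every job independently. A failure for one job is
     * recorded and printed, and the loop continues with the remaining jobs.
     */
    public static RecoveryResult recoverAll(List<String> jobIds, JobRecoverer recoverer) {
        RecoveryResult result = new RecoveryResult();
        for (String jobId : jobIds) {
            try {
                recoverer.recover(jobId);
                result.recovered.add(jobId);
            } catch (Exception e) {
                // Isolate the failure: remember it and report it, but do not rethrow.
                result.failed.put(jobId, e);
                System.err.println("Failed to recover job " + jobId + ": " + e);
            }
        }
        return result;
    }
}
```

      Returning both the recovered and the failed jobs lets the caller decide what to do with the failures (retry, alert, or drop) instead of losing that information in a single top-level exception.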

          People

            Assignee: Till Rohrmann (trohrmann)
            Reporter: Till Rohrmann (trohrmann)
            Votes: 0
            Watchers: 2
