Flink / FLINK-5193

Recovering all jobs fails completely if a single recovery fails

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0, 1.1.3
    • Fix Version/s: 1.2.0, 1.1.4
    • Component/s: JobManager
    • Labels:
      None

      Description

      In the HA case, when the JobManager tries to recover all submitted job graphs (e.g. after regaining leadership), a single failed recovery can prevent all of the submitted jobs from being recovered. Instead of failing the complete recovery procedure, the JobManager should still try to recover the remaining (non-failing) jobs and print a proper error message for each failed recovery.
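
      The requested behavior can be illustrated with a minimal Java sketch. This is not Flink's actual recovery code: the names `JobRecoverySketch`, `JobGraphLoader`, `RecoveryResult`, and `recoverAll` are hypothetical and exist only for illustration, and job graphs are represented as plain strings. The point is that each recovery is wrapped in its own try/catch, so one failure is recorded and reported without aborting the loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are part of Flink's API.
public class JobRecoverySketch {

    /** Hypothetical stand-in for whatever loads one stored job graph. */
    interface JobGraphLoader {
        String load(String jobId) throws Exception;
    }

    /** Outcome of one recovery pass: recovered graphs plus per-job failures. */
    static final class RecoveryResult {
        final List<String> recovered = new ArrayList<>();
        final Map<String, Exception> failures = new LinkedHashMap<>();
    }

    /** Tries to recover every job; a failing job does not stop the others. */
    static RecoveryResult recoverAll(List<String> jobIds, JobGraphLoader loader) {
        RecoveryResult result = new RecoveryResult();
        for (String jobId : jobIds) {
            try {
                result.recovered.add(loader.load(jobId));
            } catch (Exception e) {
                // Isolate the failure: record and report it, then continue.
                result.failures.put(jobId, e);
                System.err.println(
                    "Could not recover job " + jobId + ": " + e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RecoveryResult result = recoverAll(
            List.of("job-a", "job-b", "job-c"),
            jobId -> {
                if (jobId.equals("job-b")) {
                    throw new IllegalStateException("corrupt state handle");
                }
                return "graph(" + jobId + ")";
            });
        System.out.println("recovered=" + result.recovered);
        System.out.println("failed=" + result.failures.keySet());
    }
}
```

      In this sketch the failure of `job-b` is reported on standard error while `job-a` and `job-c` are still recovered. Collecting the failures in a map, rather than rethrowing the first one, is one possible design; it lets the caller decide afterwards whether any failed recovery should be retried.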

            People

            • Assignee:
              till.rohrmann Till Rohrmann
              Reporter:
              till.rohrmann Till Rohrmann
            • Votes:
              0
              Watchers:
              2