  1. Flink
  2. FLINK-4344 Implement new JobManager
  3. FLINK-6161

Retry connection in case of a ResourceManager heartbeat timeout


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.5.0
    • Component/s: Runtime / Coordination

    Description

      The JobMaster should try to reconnect to the latest known ResourceManager leader when a ResourceManager heartbeat times out. Otherwise, the JobMaster only attempts to connect to a ResourceManager when the leader address information changes. After a false-positive heartbeat timeout the leader address stays the same, so the JobMaster's connection to the ResourceManager could be broken permanently.
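
The behaviour requested above can be illustrated with a minimal sketch. It is not Flink's actual JobMaster code: `ReconnectSketch`, `JobMasterSketch`, `Connector`, `notifyLeaderAddress`, and `onHeartbeatTimeout` are hypothetical names chosen for illustration, and the sketch assumes the JobMaster simply remembers the last leader address it was told about.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed behaviour; not Flink's real classes. */
public class ReconnectSketch {

    /** Stand-in for whatever opens a connection to a ResourceManager. */
    interface Connector {
        void connect(String address);
    }

    static class JobMasterSketch {
        private final Connector connector;
        // Last leader address announced by leader retrieval (assumed to be cached).
        private String latestLeaderAddress;

        JobMasterSketch(Connector connector) {
            this.connector = connector;
        }

        /** Called when leader retrieval reports a leader; connects only on a change. */
        void notifyLeaderAddress(String address) {
            if (address != null && !address.equals(latestLeaderAddress)) {
                latestLeaderAddress = address;
                connector.connect(address);
            }
        }

        /** Called when the ResourceManager heartbeat times out. */
        void onHeartbeatTimeout() {
            // Without this retry, a false-positive timeout would leave the
            // JobMaster disconnected until the leader address changes.
            if (latestLeaderAddress != null) {
                connector.connect(latestLeaderAddress);
            }
        }
    }

    public static void main(String[] args) {
        List<String> attempts = new ArrayList<>();
        JobMasterSketch jobMaster = new JobMasterSketch(attempts::add);
        jobMaster.notifyLeaderAddress("rm-1"); // initial connection
        jobMaster.onHeartbeatTimeout();        // retry against the same leader
        System.out.println(attempts);          // prints [rm-1, rm-1]
    }
}
```

In this sketch the second connection attempt comes from the timeout handler alone, since the leader address never changed.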

          People

            Assignee: zjwang Zhijiang
            Reporter: trohrmann Till Rohrmann
