FLINK-13060

FailoverStrategies should respect restart constraints


Details

      Users that have enabled the "region" failover strategy, along with a restart strategy that enforces a certain number of restarts or introduces a restart delay, will see changes in behavior. This failover strategy now respects constraints that are defined by the restart strategy.

    Description

      RestartStrategies can define their own restrictions on whether a job can be restarted. For example, they could count the total number of failures or observe failure rates.

      FailoverStrategies are used for partial restarts of jobs, and currently largely bypass the restrictions defined by the restart strategies.

      My proposal is the following:

      Introduce a new method into the RestartStrategy interface to notify the strategy of failed task executions. Currently, strategies handle this implicitly in RestartStrategy#restart, so migrating our existing strategies should be trivial.

      Next, before calling RestartStrategy#restart, inform the strategy about the task failure. This retains the existing behavior.
      In addition, FailoverStrategy implementations may inform the restart strategy about task failures if and when they perform a local failover. All implementations have to check RestartStrategy#canRestart before attempting a failover.
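
      To make the proposed flow concrete, here is a minimal, self-contained sketch. It does not use Flink's actual classes: SketchRestartStrategy, FailureCountingStrategy, SketchRegionFailover and the method name notifyFailure are all hypothetical stand-ins chosen for illustration; only canRestart mirrors a name from the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a restart strategy; NOT Flink's actual interface. */
interface SketchRestartStrategy {
    /** Hypothetical new method: record one failed task execution. */
    void notifyFailure(Throwable cause);

    /** Whether the strategy's constraints still permit another restart. */
    boolean canRestart();
}

/** Assumed example strategy: tolerates at most maxFailures failures in total. */
final class FailureCountingStrategy implements SketchRestartStrategy {
    private final int maxFailures;
    private int failures;

    FailureCountingStrategy(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    @Override
    public void notifyFailure(Throwable cause) {
        failures++;
    }

    @Override
    public boolean canRestart() {
        return failures <= maxFailures;
    }
}

/** Simplified stand-in for a failover strategy that performs partial restarts. */
final class SketchRegionFailover {
    private final SketchRestartStrategy restartStrategy;

    /** Records the decisions taken, so the behavior can be inspected. */
    final List<String> actions = new ArrayList<>();

    SketchRegionFailover(SketchRestartStrategy restartStrategy) {
        this.restartStrategy = restartStrategy;
    }

    void onTaskFailure(String region, Throwable cause) {
        // Step 1: tell the restart strategy about the failure, even for a local failover.
        restartStrategy.notifyFailure(cause);

        // Step 2: consult the restart strategy before attempting any failover.
        if (restartStrategy.canRestart()) {
            actions.add("restart " + region);
        } else {
            actions.add("fail job");
        }
    }
}
```

      With a limit of two failures in this sketch, the first two task failures trigger a local restart of the affected region, and the third is no longer restarted because canRestart returns false. Without the notification step, the counter would never advance during local failovers, which is the bypass this issue describes.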

          People

            chesnay Chesnay Schepler


              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 20m
