  1. Flink
  2. FLINK-23202

RpcService should fail result futures if messages could not be sent


Details

      The same way Flink detects unreachable heartbeat targets faster, Flink now also immediately fails RPCs where the target is known by the OS to be unreachable on a network level, instead of waiting for a timeout (akka.ask.timeout).

      One effect of this is faster task failovers, because cancelling tasks on a dead TaskExecutor is no longer delayed by the RPC timeout.

      If this faster failover is a problem in certain setups (which might rely on the fact that external systems hit timeouts), we recommend configuring the application's restart strategy with a restart delay.
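The restart delay recommended above could be set in the Flink configuration roughly as follows. This is a sketch, not a recommendation of specific values: the attempt count and the delay are illustrative assumptions, and the exact key names depend on the Flink version in use (newer releases select the strategy with `restart-strategy.type`).

```yaml
# Assumed example values; tune the delay to the timeouts of the external systems involved.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```

With a fixed delay in place, a job that now fails over immediately still waits before restarting, which gives external systems time to hit their own timeouts.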

    Description

      The RpcService should fail result futures if messages could not be sent. This would speed up the failure detection mechanism because it would no longer rely on the timeout. One way to achieve this could be to listen to the dead letters and then send a Failure message back to the sender.
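The idea in the description can be sketched with plain `CompletableFuture`s. This is a minimal illustration under stated assumptions, not Flink's actual RpcService code: `PendingRequests`, `RecipientUnreachableException`, `register`, `onResponse` and `onDeadLetter` are hypothetical names, and in a real setup `onDeadLetter` would be driven by a dead-letter listener in the RPC layer. The timeout stays in place as a fallback, and `orTimeout` requires Java 9 or later.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical registry of in-flight requests; names are illustrative only. */
final class PendingRequests {

    /** Hypothetical exception signalling that a message never reached its recipient. */
    static final class RecipientUnreachableException extends RuntimeException {
        RecipientUnreachableException(String message) {
            super(message);
        }
    }

    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Registers a request; the returned future times out only as a fallback. */
    CompletableFuture<String> register(long requestId, long timeoutMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        pending.put(requestId, result);
        // Arm the fallback timeout and drop the entry once the future completes for any reason.
        result.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .whenComplete((value, error) -> pending.remove(requestId, result));
        return result;
    }

    /** Completes the pending future with the recipient's answer. */
    boolean onResponse(long requestId, String value) {
        CompletableFuture<String> result = pending.remove(requestId);
        return result != null && result.complete(value);
    }

    /** Fails the pending future immediately instead of waiting for the timeout. */
    boolean onDeadLetter(long requestId, String reason) {
        CompletableFuture<String> result = pending.remove(requestId);
        return result != null
                && result.completeExceptionally(new RecipientUnreachableException(reason));
    }
}
```

A caller that registers a request with a 60 second timeout and then receives a dead-letter notification for the same id sees the future fail right away with `RecipientUnreachableException`, which is the faster failure detection the issue asks for.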

          People

            trohrmann Till Rohrmann