Flink / FLINK-23202

RpcService should fail result futures if messages could not be sent

Details

      In the same way that Flink detects unreachable heartbeat targets faster, it now also immediately fails RPCs whose target the OS knows to be unreachable at the network level, instead of waiting for the RPC timeout (akka.ask.timeout).

      One effect of this is faster task failover, because cancelling tasks on a dead TaskExecutor is no longer delayed by the RPC timeout.

      If this faster failover is a problem in certain setups (which might rely on the fact that external systems hit timeouts), we recommend configuring the application's restart strategy with a restart delay.

    Description

      The RpcService should fail result futures if messages could not be sent. This would speed up failure detection because it would no longer rely on the timeout. One way to achieve this could be to listen for dead letters and then send a Failure message back to the sender.
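
      The idea above can be illustrated with a minimal, self-contained sketch. It is a toy in-memory model, not Flink's RpcService or Akka's dead-letter mechanism: the names ToyRpcService, onDeadLetter and RecipientUnreachableException are hypothetical and exist only for illustration. The sketch assumes that an undeliverable message can be detected at send time, and shows the result future being failed right away instead of staying pending until a timeout fires.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Toy model only; none of these names are Flink or Akka APIs. */
public class DeadLetterRpcSketch {

    /** Hypothetical exception: the message never reached its recipient. */
    static class RecipientUnreachableException extends RuntimeException {
        RecipientUnreachableException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory stand-in for an RPC service. */
    static class ToyRpcService {
        // Reachable endpoints by address; each handler maps a request to a response.
        private final Map<String, Function<String, String>> endpoints = new ConcurrentHashMap<>();

        void register(String address, Function<String, String> handler) {
            endpoints.put(address, handler);
        }

        void unregister(String address) {
            endpoints.remove(address);
        }

        /** Sends a request and returns a future for the response. */
        CompletableFuture<String> ask(String address, String request) {
            CompletableFuture<String> result = new CompletableFuture<>();
            Function<String, String> handler = endpoints.get(address);
            if (handler == null) {
                // Undeliverable: take the dead-letter path instead of
                // leaving the future pending until a timeout expires.
                onDeadLetter(address, request, result);
            } else {
                result.complete(handler.apply(request));
            }
            return result;
        }

        /** Plays the role of the "Failure message back to the sender". */
        private void onDeadLetter(String address, String request, CompletableFuture<String> sender) {
            sender.completeExceptionally(new RecipientUnreachableException(
                    "Could not deliver '" + request + "' to " + address));
        }
    }

    public static void main(String[] args) {
        ToyRpcService rpc = new ToyRpcService();
        rpc.register("taskexecutor-1", request -> "ack:" + request);

        // Reachable target: the future completes normally.
        System.out.println(rpc.ask("taskexecutor-1", "cancelTask").join()); // prints "ack:cancelTask"

        // Unreachable target: the future is already failed when ask() returns.
        rpc.unregister("taskexecutor-1");
        CompletableFuture<String> failed = rpc.ask("taskexecutor-1", "cancelTask");
        System.out.println(failed.isCompletedExceptionally()); // prints "true"
    }
}
```

      In a real actor system the undeliverable message would surface asynchronously through a dead-letter listener rather than through a synchronous lookup, so this sketch only captures the contract seen by the caller: the failure arrives as soon as non-delivery is known.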

            People

              trohrmann Till Rohrmann