Ignite / IGNITE-20081

Implement "weakSend" properly, add "weakInvoke"

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The idea is as follows: some components, such as RAFT, can tolerate lost messages. Strict message delivery guarantees may not be a good fit for such components.

      However, the current implementation of "weakSend" is just a wrapper around "send" that doesn't return a future. This API must be redesigned and properly implemented.

      API

      • CompletableFuture<Void> weakSend(ClusterNode recipient, NetworkMessage msg, long timeout);

      • CompletableFuture<NetworkMessage> weakInvoke(ClusterNode recipient, NetworkMessage msg, long timeout);

      Futures are completed in one of two cases:

      • an ack or a response has been received
      • the timeout has expired

      This means that a very large timeout is probably a bad idea for such messages.
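
      The two completion cases above can be sketched with the standard "CompletableFuture" API (Java 9+). This is only an illustration: the class "WeakFutureSketch" and the method "completeOnAckOrTimeout" are hypothetical names, not part of Ignite, and failing the future with a "TimeoutException" on timeout is an assumption carried over from the invoke case described in this ticket.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of the Ignite API. */
class WeakFutureSketch {
    /**
     * Returns a future that completes normally when {@code ackOrResponse} does,
     * or fails with a TimeoutException once {@code timeoutMillis} has elapsed.
     */
    static <T> CompletableFuture<T> completeOnAckOrTimeout(
            CompletableFuture<T> ackOrResponse, long timeoutMillis) {
        // copy() keeps the timeout from completing the caller's original future.
        return ackOrResponse.copy().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

      Since the returned future stays pending until one of the two events happens, a caller that passes a large timeout may hold on to it for just as long.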

      Implementation

      • with a stable and fast connection, weak communication should behave the same as regular communication from the client's standpoint;
      • if the message queue for a given connection is full, we may/should:
        • remove from the queue all weak messages that have definitely not been sent yet;
        • reject new weak messages;
        • maybe throttle, but this is out of scope;
      • alternatively, if the connection breaks, we may start removing weak messages from the queue and/or rejecting new ones.
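
      One possible reading of the full-queue policy above, as a minimal sketch. Everything here is hypothetical ("OutboundQueueSketch", "Entry", the string payload); in particular, the choice to evict unsent weak messages only when a non-weak message needs room is an assumption, since the ticket leaves the exact policy open.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical per-connection outbound queue; names are illustrative only. */
class OutboundQueueSketch {
    /** A queued message that has not been written to the connection yet. */
    static final class Entry {
        final String payload;
        final boolean weak;

        Entry(String payload, boolean weak) {
            this.payload = payload;
            this.weak = weak;
        }
    }

    private final Deque<Entry> unsent = new ArrayDeque<>();
    private final int capacity;

    OutboundQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Tries to enqueue a message; returns false if it was rejected. */
    synchronized boolean offer(Entry entry) {
        if (unsent.size() >= capacity) {
            if (entry.weak) {
                return false; // Queue is full: reject new weak messages.
            }
            // Queue is full: drop weak messages that were never sent.
            unsent.removeIf(e -> e.weak);
            if (unsent.size() >= capacity) {
                return false; // Assumption: a real implementation would throttle here.
            }
        }
        unsent.addLast(entry);
        return true;
    }

    /** Takes the next message to write to the connection, or null if none. */
    synchronized Entry poll() {
        return unsent.pollFirst();
    }

    synchronized int size() {
        return unsent.size();
    }
}
```

      A real implementation would presumably also complete the futures of evicted and rejected messages; that part is omitted here.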

      Weak send and weak invoke may behave differently.

      For example, "weakSend" requires an ack, so the message has to be assigned a "message number" in the recovery descriptor.
      "weakInvoke", on the other hand, doesn't need an ack; it only requires a response (and already has a "correlationId"), so not re-sending it after a reconnect shouldn't break the recovery protocol. It doesn't need a "message number" in the recovery descriptor, so we can save some resources by reducing the number of acks.

      One more important thing:

      • when an invoke future fails with a timeout exception, we must clean up the corresponding correlation ID from the map;
      • when we receive a "node left" event for some node, we should complete all futures returned for that node with some "NodeLeftException", and clean up all of its correlation IDs from the map as well.
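
      The cleanup rules above could look roughly like this. It is a sketch under assumptions: "InvokeTrackerSketch", its methods, the string node IDs and string responses are all hypothetical, and "NodeLeftException" is only a placeholder for whatever exception type is eventually chosen.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical tracker of in-flight invocations; not the actual Ignite code. */
class InvokeTrackerSketch {
    /** Placeholder for the "NodeLeftException" mentioned above. */
    static final class NodeLeftException extends RuntimeException {
        NodeLeftException(String nodeId) {
            super("Node left: " + nodeId);
        }
    }

    private static final class Pending {
        final String nodeId;
        final CompletableFuture<String> future;

        Pending(String nodeId, CompletableFuture<String> future) {
            this.nodeId = nodeId;
            this.future = future;
        }
    }

    /** Correlation ID -> pending invocation. */
    private final Map<Long, Pending> pending = new ConcurrentHashMap<>();

    CompletableFuture<String> invoke(String nodeId, long correlationId, long timeoutMillis) {
        CompletableFuture<String> future =
                new CompletableFuture<String>().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        pending.put(correlationId, new Pending(nodeId, future));
        // However the future completes (response, timeout, node left),
        // its correlation ID is removed from the map.
        future.whenComplete((response, error) -> pending.remove(correlationId));
        return future;
    }

    void onResponse(long correlationId, String response) {
        Pending p = pending.remove(correlationId);
        if (p != null) {
            p.future.complete(response);
        }
    }

    void onNodeLeft(String nodeId) {
        // ConcurrentHashMap iterators tolerate the removals triggered by completion.
        for (Pending p : pending.values()) {
            if (p.nodeId.equals(nodeId)) {
                p.future.completeExceptionally(new NodeLeftException(nodeId));
            }
        }
    }

    int pendingCount() {
        return pending.size();
    }
}
```

      Tying the map cleanup to future completion means that a late response for a timed-out invocation simply finds no entry and is ignored.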

      Integration

      Integration will be done separately. For now, all we need is a set of unit tests.

          People

            Assignee: Unassigned
            Reporter: Ivan Bessonov (ibessonov)