Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.8.1
-
None
-
Fixed processing of the failure detection timeout in TcpDiscoverySpi. If a node fails to send a message or ping, it now drops the current connection strictly within this timeout and begins establishing a new connection much sooner.
-
Release Notes Required
Description
A connection failure may not be detected within IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout. In addition, the node ping routine is duplicated.
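To make the worst-case figure concrete, here is a minimal arithmetic sketch. It is not Ignite code: the class and method names are hypothetical, the 500 ms interval is the constant quoted in this issue, and the 10 000 ms timeout is only an assumed example value.

```java
/** Illustrative only: models the worst-case detection delay described in this issue. */
public final class WorstCaseDelay {
    /** Value of ServerImpl.CON_CHECK_INTERVAL as quoted in this issue, in milliseconds. */
    public static final long CON_CHECK_INTERVAL_MS = 500;

    private WorstCaseDelay() {
        // Utility class, no instances.
    }

    /**
     * Worst case before the fix: the failure happens right after a connection check,
     * so a full check interval passes before the timeout even starts counting.
     */
    public static long worstCaseMs(long conCheckIntervalMs, long failureDetectionTimeoutMs) {
        return conCheckIntervalMs + failureDetectionTimeoutMs;
    }

    public static void main(String[] args) {
        long failureDetectionTimeoutMs = 10_000; // assumed example value

        long delay = worstCaseMs(CON_CHECK_INTERVAL_MS, failureDetectionTimeoutMs);

        // prints "Worst-case detection delay: 10500 ms"
        System.out.println("Worst-case detection delay: " + delay + " ms");
    }
}
```

With these numbers, detection may take 5% longer than the configured timeout. The relative overshoot grows as the timeout is set lower.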
We should fix:
1. The failure detection timeout should take into account the last sent message. The current ping is bound to its own timestamp:
ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent
This is odd, because any discovery message checks the connection.
2. Make the connection check interval depend on the failure detection timeout (FDT). The current value is a constant:
static int ServerImpl.CON_CHECK_INTERVAL = 500
3. Remove the additional, accelerated connection check. Once fix 1 is done, it becomes even less useful.
Although TCP discovery has a connection check period, it may send a ping before this period elapses. For some reason, this premature ping also relies on the time of the last received message.
4. Do not alarm the user with “Node seems disconnected” when everything is OK. Once fixes 1 and 3 are done, this message becomes even less useful.
A node may log at INFO level: “Local node seems to be disconnected from topology …” when it is not actually disconnected.
Attachments
Issue Links
- causes
-
IGNITE-13206 Represent in the documentation affection of several node addresses on failure detection.
- Closed
- is depended upon by
-
IGNITE-13016 Fix backward checking of failed node.
- Resolved
-
IGNITE-13015 Use nano time in node failure detection.
- Resolved
-
IGNITE-13205 Represent in logs, javadoc affection of several node addresses on failure detection.
- Resolved
-
IGNITE-13134 Fix connection recovery timeout.
- Closed
-
IGNITE-13090 Add parameter of connection check period to TcpDiscoverySpi
- Closed
- links to