Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
The newly implemented liveness checks for critical worker threads should be mentioned in the Ignite documentation. A brief description of the functionality follows.
An Ignite node has a number of critical worker threads that must be alive and responsive; otherwise the node's health is not guaranteed. These threads monitor each other periodically and track two aspects of the thread being checked:
- whether it's alive;
- whether it updates its internal heartbeat timestamp.
Whenever at least one of the above conditions is violated, the checker thread logs an error and calls the currently configured FailureHandler.
The IgniteConfiguration.SystemWorkerBlockedTimeout configuration property affects monitoring behavior: it sets how long a worker may go without updating its heartbeat before it is considered blocked. At runtime, monitoring settings can be changed via FailureHandlingMxBean.
By default, liveness checks are enabled, but detection of a blocked system worker does not lead to failure handler invocation; see FailureProcessor#getDefaultFailureHandler.
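The two checks described above can be illustrated with a minimal, JDK-only sketch. It is not Ignite's implementation: the `LivenessCheckSketch` class, the `Worker` type, the `checkAll` method and the message strings are hypothetical names introduced for illustration, and the failure handler is modeled as a plain `Consumer<String>`. Time is passed in explicitly so that the example is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LivenessCheckSketch {
    /** Hypothetical stand-in for a critical worker; not an Ignite class. */
    public static final class Worker {
        public final String name;
        public volatile boolean alive = true; // stands in for Thread.isAlive()
        public volatile long heartbeatMs;     // last time the worker reported progress

        public Worker(String name, long nowMs) {
            this.name = name;
            this.heartbeatMs = nowMs;
        }

        /** Called by the worker itself on each iteration of its main loop. */
        public void updateHeartbeat(long nowMs) {
            heartbeatMs = nowMs;
        }
    }

    /** Checks every worker once and reports each violation to the handler. */
    public static void checkAll(List<Worker> workers, long nowMs, long blockedTimeoutMs,
                                Consumer<String> failureHandler) {
        for (Worker w : workers) {
            if (!w.alive)
                failureHandler.accept("terminated: " + w.name);
            else if (nowMs - w.heartbeatMs > blockedTimeoutMs)
                failureHandler.accept("blocked: " + w.name);
        }
    }

    public static void main(String[] args) {
        Worker healthy = new Worker("healthy", 0L);
        Worker stuck = new Worker("stuck", 0L);
        Worker dead = new Worker("dead", 0L);

        List<Worker> workers = new ArrayList<>();
        workers.add(healthy);
        workers.add(stuck);
        workers.add(dead);

        healthy.updateHeartbeat(9_000L); // only this worker keeps reporting progress
        dead.alive = false;              // simulates a worker thread that has exited

        List<String> failures = new ArrayList<>();
        checkAll(workers, 10_000L, 5_000L, failures::add);

        System.out.println(failures); // prints [blocked: stuck, terminated: dead]
    }
}
```

In this sketch a worker whose heartbeat is older than the timeout is reported as blocked, and a worker that is no longer alive is reported as terminated; a real checker would run such a pass periodically rather than once.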
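For the documentation, a configuration fragment along the following lines might be useful. It is a sketch under assumptions, not verified against a specific Ignite release: the 30-second timeout is an arbitrary example value, the class name `WorkerLivenessConfigExample` is hypothetical, and it assumes that the failure handler exposes `setIgnoredFailureTypes` and that clearing the ignored set is what makes blocked-worker detection reach the handler. It requires `ignite-core` on the classpath.

```java
import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;

public class WorkerLivenessConfigExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Example value (assumption): consider a system worker blocked
        // after 30 seconds without a heartbeat update.
        cfg.setSystemWorkerBlockedTimeout(30_000L);

        StopNodeOrHaltFailureHandler hnd = new StopNodeOrHaltFailureHandler();

        // Assumption: with no ignored failure types, a blocked system worker
        // is passed to the handler instead of only being logged.
        hnd.setIgnoredFailureTypes(Collections.emptySet());

        cfg.setFailureHandler(hnd);

        // Ignite is AutoCloseable, so the node stops when the block exits.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Node started: " + ignite.name());
        }
    }
}
```

Whether stopping or halting the node is the right reaction to a blocked worker depends on the deployment; the handler choice here is only an example.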
Attachments
Issue Links
- is blocked by
  - IGNITE-9737 Ignite WatchDog service should be configurable (Resolved)
- is related to
  - IGNITE-6587 Ignite watchdog service (Resolved)