  1. Ignite
  2. IGNITE-9679

Document critical workers liveness checking implementation


Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.7
    • Component/s: documentation
    • Labels: None

    Description

      The newly implemented liveness checks for critical worker threads should be covered in the Ignite documentation. A brief description of the functionality follows.

      An Ignite node has a number of critical worker threads that must stay alive and responsive; otherwise the node's health is not guaranteed. These threads monitor each other periodically and track two things about each thread being checked:

      • whether it is alive;
      • whether it updates its internal heartbeat timestamp.

      Whenever at least one of these conditions is violated, the checker thread logs an error and calls the currently configured FailureHandler.
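The check described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Ignite's actual implementation: `SketchWorker`, `LivenessSketch`, the `"terminated"`/`"blocked"` reason strings, and the `BiConsumer` standing in for the failure handler are all hypothetical names, and time is passed in explicitly so the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for a critical worker; not an Ignite class. */
class SketchWorker {
    final String name;
    final Thread thread;
    private volatile long heartbeatTs;

    SketchWorker(String name, Thread thread, long nowMs) {
        this.name = name;
        this.thread = thread;
        this.heartbeatTs = nowMs;
    }

    /** The worker calls this from its own loop to show it is making progress. */
    void updateHeartbeat(long nowMs) { heartbeatTs = nowMs; }

    long heartbeat() { return heartbeatTs; }
}

class LivenessSketch {
    /**
     * Checks both conditions for every worker: the thread is alive, and its
     * heartbeat is not older than blockedTimeoutMs. Each violation is logged
     * and passed to the handler. Returns the names of the failed workers.
     */
    static List<String> check(List<SketchWorker> workers, long nowMs, long blockedTimeoutMs,
                              BiConsumer<SketchWorker, String> failureHandler) {
        List<String> failed = new ArrayList<>();
        for (SketchWorker w : workers) {
            String reason = null;
            if (!w.thread.isAlive())
                reason = "terminated";
            else if (nowMs - w.heartbeat() > blockedTimeoutMs)
                reason = "blocked";

            if (reason != null) {
                System.err.println("Critical worker " + w.name + " is " + reason);
                failureHandler.accept(w, reason);
                failed.add(w.name);
            }
        }
        return failed;
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch stop = new CountDownLatch(1);
        Thread live = new Thread(() -> {
            try { stop.await(); } catch (InterruptedException ignored) { }
        });
        live.setDaemon(true);
        live.start();

        Thread dead = new Thread(() -> { });
        dead.start();
        dead.join(); // the thread has finished, so isAlive() is false

        List<SketchWorker> workers = Arrays.asList(
            new SketchWorker("healthy", live, 1_000L), // heartbeat 500 ms old
            new SketchWorker("stalled", live, 0L),     // heartbeat 1500 ms old
            new SketchWorker("crashed", dead, 1_000L));

        // "Now" is 1500 ms and the allowed heartbeat age is 1000 ms.
        System.out.println(check(workers, 1_500L, 1_000L, (w, reason) -> { }));
        // prints [stalled, crashed]

        stop.countDown();
    }
}
```

In this sketch a single pass over the workers is shown; in a real node the same check would run periodically from the monitoring threads.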

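      The IgniteConfiguration.SystemWorkerBlockedTimeout configuration property controls monitoring behavior: it sets the maximum period a system worker may go without updating its heartbeat before it is considered blocked. At runtime, monitoring settings can be changed via FailureHandlingMxBean.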

      By default, liveness checks are enabled, but detection of a blocked system worker does not lead to failure handler invocation; see FailureProcessor#getDefaultFailureHandler.
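A possible way to configure these settings from Java is sketched below. It assumes `ignite-core` 2.7 or later on the classpath; the 30-second timeout is an arbitrary illustrative value, and clearing the handler's ignored failure types is an assumption about how the default behavior is overridden (the default handler is assumed to ignore the blocked-worker failure type), so verify it against the documentation for your Ignite version.

```java
import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;

public class WorkerLivenessConfigExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Maximum time a system worker may stay unresponsive before it is
        // reported as blocked (illustrative value, in milliseconds).
        cfg.setSystemWorkerBlockedTimeout(30_000L);

        // Assumption: the default handler ignores blocked-worker failures.
        // An explicit handler with no ignored failure types makes a blocked
        // worker trigger the handler as well.
        StopNodeOrHaltFailureHandler hnd = new StopNodeOrHaltFailureHandler();
        hnd.setIgnoredFailureTypes(Collections.emptySet());
        cfg.setFailureHandler(hnd);

        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Node started: " + ignite.name());
        }
    }
}
```

Leaving the failure handler unset keeps the default behavior described above, where a blocked worker is detected and logged but the handler is not invoked.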

            People

              dmagda Denis A. Magda
              andrey-kuznetsov Andrey Kuznetsov