Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Won't Fix
Description
This idea has been bandied about a fair amount. The concept is to add a second Java process that runs next to each region server and acts as a watchdog. Several possible purposes:
- monitor the RS for liveness: if it exhibits Juliet syndrome ("appears dead"), kill it aggressively to prevent it from coming back to life
- restart RS automatically in failure cases
- potentially move the entire ZK session to the watchdog to decouple node liveness from the particular JVM liveness
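
To make the first two purposes concrete, here is a rough sketch of what the watchdog's decision loop might look like. Everything in it is an assumption for discussion, not existing HBase code: the `RegionServerWatchdog` class, the `Supervised` interface, and the `maxMissedProbes` threshold are all hypothetical names. It does not attempt the ZK session idea at all.

```java
// Hypothetical sketch: none of these types exist in HBase.
class RegionServerWatchdog {

    /** Assumed abstraction over the region server process being watched. */
    interface Supervised {
        boolean isProcessAlive();   // OS level: does the JVM still exist?
        boolean respondsToProbe();  // application level: does it answer a health check?
        void killForcibly();        // e.g. the equivalent of kill -9
        void restart();             // launch a fresh region server
    }

    enum Action { NONE, RESTARTED, KILLED_AND_RESTARTED }

    private final Supervised target;
    private final int maxMissedProbes;
    private int missedProbes = 0;

    RegionServerWatchdog(Supervised target, int maxMissedProbes) {
        this.target = target;
        this.maxMissedProbes = maxMissedProbes;
    }

    /** Runs one liveness check and reports what the watchdog did. */
    Action check() {
        // Failure case: the process is gone, so just bring it back.
        if (!target.isProcessAlive()) {
            missedProbes = 0;
            target.restart();
            return Action.RESTARTED;
        }
        // Healthy: reset the miss counter.
        if (target.respondsToProbe()) {
            missedProbes = 0;
            return Action.NONE;
        }
        // "Appears dead": the process exists but does not answer.
        missedProbes++;
        if (missedProbes >= maxMissedProbes) {
            // Kill it before restarting so it cannot come back to life later.
            target.killForcibly();
            target.restart();
            missedProbes = 0;
            return Action.KILLED_AND_RESTARTED;
        }
        return Action.NONE;
    }
}
```

A real watchdog would call `check()` on a timer and would need a concrete `Supervised`; one option would be to back it with `java.lang.Process`, using `isAlive()` and `destroyForcibly()`. What the health probe should actually be (an RPC, a ZK check, something else) is left open here.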
Let's discuss in this JIRA.