Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Incomplete
Description
ChaosMonkey on EC2 may occasionally send a kill command to a remote RS/master and succeed in killing it, yet still get back a strange exit code (such as permission denied) and therefore neglect to restart it. This can negatively affect the test, especially when the master is killed and never brought back.
We need to add exception handling around the kill command and, inside that handler, check whether the service is actually still running before deciding whether to restart it.
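A minimal sketch of the proposed handling, in Python for brevity rather than in the project's own code. Every name here (`CommandFailed`, `kill_then_restart`, `FakeService`) is hypothetical and is not part of ChaosMonkey; the sketch only assumes that the kill, status check, and restart steps can each be called separately.

```python
from typing import Callable


class CommandFailed(Exception):
    """Hypothetical error raised when a remote command exits non-zero."""


def kill_then_restart(
    kill: Callable[[], None],
    is_running: Callable[[], bool],
    restart: Callable[[], None],
) -> None:
    """Kill a service and restart it, tolerating a misleading kill failure."""
    try:
        kill()
    except CommandFailed:
        # Do not trust the exit code alone: the process may be dead anyway.
        if is_running():
            # The kill really failed, so there is nothing to restart.
            raise
        # Otherwise the service is down despite the error; fall through.
    restart()


class FakeService:
    """In-memory stand-in for a remote RS/master (illustration only)."""

    def __init__(self) -> None:
        self.up = True

    def kill(self) -> None:
        # Simulates the reported case: the process dies, but the
        # command still reports a failure.
        self.up = False
        raise CommandFailed("permission denied")

    def is_running(self) -> bool:
        return self.up

    def restart(self) -> None:
        self.up = True


if __name__ == "__main__":
    service = FakeService()
    kill_then_restart(service.kill, service.is_running, service.restart)
    print("service up after kill/restart cycle:", service.up)
```

The point of the design is that the restart decision depends on the observed state of the service, not on the exit code of the kill command. If the service is still running after a failed kill, the error is re-raised so the failure stays visible.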