Description
I’m working on a mechanism to disable health checks for a given instance of a job in order to preserve its state while diagnostic information is collected. Some diagnostic tasks, such as collecting heap dumps, may cause the shard to start failing health checks; as a result, vital diagnostic information is destroyed.
The process would consist of touching a file in the sandbox that signals the health check to temporarily ignore the state of that instance.
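A minimal sketch of how the file-based signal might work, assuming the health check is a callable returning a (healthy, reason) pair. The file name `.healthchecksnooze` and the helper `snoozable_check` are hypothetical names used for illustration, not part of any existing API:

```python
import os
from typing import Callable, Tuple

# Hypothetical sentinel file name; this ticket does not specify one.
SNOOZE_FILE = ".healthchecksnooze"


def snoozable_check(
    sandbox_dir: str,
    check: Callable[[], Tuple[bool, str]],
) -> Tuple[bool, str]:
    """Run `check` unless the snooze file is present in the sandbox.

    While the file exists, the instance is reported healthy without
    running the real check, so it is not treated as failing.
    """
    if os.path.isfile(os.path.join(sandbox_dir, SNOOZE_FILE)):
        return True, "health check snoozed"
    return check()
```

Under these assumptions, an operator would touch the file in the sandbox before starting a heap dump and remove it afterwards, at which point normal health checking resumes.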