Description
Why?
As described in JAMES-3605, RabbitMQ consumers can end up being stuck.
In such a scenario we are not able to detect the failure and end up relying on consumer reports to restart James.
We would like to have a healthcheck that automatically tests mail reception (both periodic logs and an HTTP endpoint for integration with a monitoring system)...
So this would be a healthcheck for a feature rather than for a technical component.
How?
In `healthcheck.properties` one could configure a user to run reception checks:
reception.checks.user=test@domain.tld
If configured, the healthcheck would then send a mail to the user, await the email via the eventbus, then retrieve its content.
The check would be placed in a new `server/container/feature-checks` Maven project.
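As a rough illustration of the flow described above (send a probe mail, await its arrival through an event, then read the content back), here is a minimal plain-Java sketch. It does not use the James APIs: `MailSender`, `MailboxEvents`, `MailReader`, `Result` and `InMemoryServer` are hypothetical stand-ins invented for this example, and only the `reception.checks.user` property comes from the proposal. The `paused` flag simulates a stuck consumer, where the mail is accepted but never delivered.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

public class ReceptionCheckSketch {
    // Hypothetical abstractions, NOT James interfaces.
    interface MailSender { void send(String recipient, String subject, String body); }
    interface MailboxEvents { void onMessageAdded(String user, Consumer<String> subjectListener); }
    interface MailReader { Optional<String> readBody(String user, String subject); }

    enum Result { HEALTHY, UNHEALTHY, DISABLED }

    static Result check(Properties config, MailSender sender, MailboxEvents events,
                        MailReader reader, Duration timeout) {
        String user = config.getProperty("reception.checks.user");
        if (user == null || user.isBlank()) {
            return Result.DISABLED; // no user configured: the check does not run
        }
        String subject = "reception-check-" + UUID.randomUUID();
        String body = "probe " + subject;
        CompletableFuture<Void> received = new CompletableFuture<>();
        // Subscribe before sending so the arrival event cannot be missed.
        events.onMessageAdded(user, s -> {
            if (s.equals(subject)) {
                received.complete(null);
            }
        });
        sender.send(user, subject, body);
        try {
            received.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return Result.UNHEALTHY; // the mail never arrived in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Result.UNHEALTHY;
        }
        // Finally retrieve the content and compare it with what was sent.
        return reader.readBody(user, subject).filter(body::equals).isPresent()
            ? Result.HEALTHY : Result.UNHEALTHY;
    }

    // In-memory fake used only to exercise the sketch.
    static class InMemoryServer implements MailSender, MailboxEvents, MailReader {
        volatile boolean paused = false;
        private final Map<String, String> bodies = new ConcurrentHashMap<>();
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

        public void onMessageAdded(String user, Consumer<String> listener) {
            listeners.add(listener); // simplification: listeners are not scoped per user
        }

        public void send(String recipient, String subject, String body) {
            if (paused) {
                return; // simulates a stuck consumer: nothing is ever delivered
            }
            bodies.put(recipient + "/" + subject, body);
            listeners.forEach(l -> l.accept(subject));
        }

        public Optional<String> readBody(String user, String subject) {
            return Optional.ofNullable(bodies.get(user + "/" + subject));
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("reception.checks.user", "test@domain.tld");
        InMemoryServer server = new InMemoryServer();
        Duration timeout = Duration.ofMillis(200);
        System.out.println(check(config, server, server, server, timeout));
        server.paused = true;
        System.out.println(check(config, server, server, server, timeout));
    }
}
```

Using a unique subject per probe keeps concurrent or leftover probe mails from being mistaken for the current one. A real implementation would also need to clean up probe mails and listeners, which this sketch leaves out.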
Definition of done
- Given a paused RabbitMQ, the check fails for the distributed server.
- Given a working James the check passes.
Disclaimer
The idea was first expressed by Matthieu years ago...