We expect deleted elements not to be dequeued, which the RabbitMQ MailQueue fails to guarantee.
While working on a fix for this bug in https://github.com/linagora/james-project/pull/2445 , I realized that one of the invariants we enforced on MailQueue does not hold.
The invariant is: "A single email is enqueued a single time in a given queue".
That invariant is broken for instance when:
- reprocessing an email: the name is not altered
- Recipient RewriteTable rewrites to a distant server
- RemoteDelivery bouncing configured with an SMTP gateway
The RabbitMQ mail queue needs to leverage a Cassandra projection in order to offer MailQueue management capabilities (clear, delete, browse, size amongst others). We naturally chose the mail name, which is used everywhere to identify an email, as the key to build this projection.
As a result, a given mail can only be enqueued a single time in a given queue: on the second enqueue, the mail will be referenced as already deleted from the queue and will be discarded.
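To make the failure mode concrete, here is a minimal in-memory sketch. It is not the actual James code: `NameKeyedQueue`, its fields and its methods are hypothetical stand-ins for the RabbitMQ queue and the Cassandra projection, and it assumes a mail is recorded as deleted once it has been dequeued.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Optional;
import java.util.Queue;
import java.util.Set;

// Hypothetical stand-in for the RabbitMQ queue and its Cassandra projection.
class NameKeyedQueue {
    // Stands in for the RabbitMQ queue: holds mail names in FIFO order.
    private final Queue<String> broker = new ArrayDeque<>();
    // Stands in for the projection of deleted mails, keyed by mail name.
    private final Set<String> deletedNames = new HashSet<>();

    void enqueue(String mailName) {
        broker.add(mailName);
    }

    Optional<String> dequeue() {
        while (!broker.isEmpty()) {
            String name = broker.poll();
            // Set.add returns false when the name is already recorded as deleted.
            if (deletedNames.add(name)) {
                return Optional.of(name);
            }
            // Otherwise the message is silently discarded.
        }
        return Optional.empty();
    }
}
```

With this sketch, enqueuing `"mail-1"`, dequeuing it, then enqueuing `"mail-1"` again leaves the second copy undeliverable: the second `dequeue()` returns an empty `Optional`.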
In https://github.com/linagora/james-project/pull/2445 I only scratched the surface by enforcing the aforementioned invariant, whereas the right approach is to make the RabbitMQ mail queue correctly handle multiple enqueues of emails with the same name. This might involve table schema changes.
We need to assign each enqueued email an EnQueueId that identifies it. This level of indirection allows the same email to be enqueued several times.
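The indirection could look like the following in-memory sketch. Everything here is an assumption made for illustration: `EnqueueIdKeyedQueue`, `Item`, and the choice of a random `UUID` as the identifier are hypothetical and do not describe the actual table schema or James API.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Optional;
import java.util.Queue;
import java.util.Set;
import java.util.UUID;

// Hypothetical queue whose projection is keyed by a per-enqueue identifier.
class EnqueueIdKeyedQueue {
    // Pairs the identifier generated at enqueue time with the mail name.
    static final class Item {
        final UUID enqueueId;
        final String mailName;

        Item(UUID enqueueId, String mailName) {
            this.enqueueId = enqueueId;
            this.mailName = mailName;
        }
    }

    private final Queue<Item> broker = new ArrayDeque<>();
    // Deleted entries are tracked by enqueue id, no longer by mail name.
    private final Set<UUID> deletedIds = new HashSet<>();

    // Each call gets a fresh id (assumed here to be a random UUID).
    UUID enqueue(String mailName) {
        UUID id = UUID.randomUUID();
        broker.add(new Item(id, mailName));
        return id;
    }

    void delete(UUID enqueueId) {
        deletedIds.add(enqueueId);
    }

    Optional<String> dequeue() {
        while (!broker.isEmpty()) {
            Item item = broker.poll();
            // Only this specific enqueue is marked as consumed.
            if (deletedIds.add(item.enqueueId)) {
                return Optional.of(item.mailName);
            }
        }
        return Optional.empty();
    }
}
```

Because the deletion marker is attached to the enqueue and not to the mail name, re-enqueuing `"mail-1"` after it has been dequeued yields a new identifier and the mail is delivered again, while an explicit `delete` still prevents only the targeted enqueue from being dequeued.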
- The breaking way to upgrade:
- Stop exposing James to JMAP & SMTP traffic and wait for the queue to empty
- Deploy the newest version (will create new empty tables)
- Re-enable traffic
If the upgrade is done carelessly, some emails might be lost.
I don't think we can really be smarter here...