Mailbox counter inconsistencies require manual intervention from the admin (via the solve-inconsistency webadmin endpoint) to be resolved.
Having such a task to ensure a correct denormalisation state is desirable, but ideally we should not have to rely on the admin remembering to run it.
We would prefer an auto-healing solution.
Mailbox counter consistency has been a blocker for some of our existing deployments.
Read repair is a classic mechanism in eventually consistent databases (like Cassandra) that piggybacks consistency healing operations upon reads.
A fraction of the reads will have a wider reach, read more data, and ensure it is correctly replicated.
Cassandra's eventual consistency is all about "replication", but "denormalization" consistency needs to be handled at the application layer. In the past we set up "Solve inconsistency" tasks, which can be likened to Cassandra repairs. In order to achieve denormalization auto-healing, we thus need to implement "applicative read repairs".
= Technically speaking
Upon reads, for each mailbox, the mailbox mapper has a configurable random chance of triggering a read repair.
In this case we read the counters, iterate over the messages metadata, and apply the "solve inconsistency" mechanism if needed.
Write unit tests demonstrating this "solve inconsistency" behavior.
Given that reading "all messages metadata" can be an expensive operation, we want to run the repair asynchronously, in the background.
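
To make the mechanism described above more concrete, here is a minimal Java sketch of a probabilistic, asynchronous read repair. Every name in it (`ReadRepairingCounterReader`, `CounterStore`, `unseenCount`, `repair`) is hypothetical and chosen for illustration only; it is not the actual mailbox mapper API, and a real implementation would work against Cassandra rather than this simplified interface.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.DoubleSupplier;

// Hypothetical names throughout: this does not reflect the real mailbox mapper API.
final class ReadRepairingCounterReader {

    /** Hypothetical view of the denormalised counter and of the messages metadata it summarises. */
    interface CounterStore {
        long readUnseenCounter(String mailboxId);

        /** One entry per message in the mailbox, true when the message is seen. */
        List<Boolean> readSeenFlags(String mailboxId);

        void writeUnseenCounter(String mailboxId, long value);
    }

    private final CounterStore store;
    private final double readRepairChance; // expected in [0, 1]; 0 disables read repair
    private final DoubleSupplier random;   // expected to yield values in [0, 1)
    private final Executor repairExecutor; // where the background repair runs

    ReadRepairingCounterReader(CounterStore store, double readRepairChance,
                               DoubleSupplier random, Executor repairExecutor) {
        this.store = store;
        this.readRepairChance = readRepairChance;
        this.random = random;
        this.repairExecutor = repairExecutor;
    }

    /** Returns the stored counter right away; a fraction of the reads also schedules a repair. */
    long unseenCount(String mailboxId) {
        long counter = store.readUnseenCounter(mailboxId);
        if (random.getAsDouble() < readRepairChance) {
            // Fire and forget: the caller does not wait for the expensive metadata scan.
            CompletableFuture.runAsync(() -> repair(mailboxId), repairExecutor);
        }
        return counter;
    }

    /** Recomputes the counter from the messages metadata and rewrites it only when it differs. */
    void repair(String mailboxId) {
        long actual = store.readSeenFlags(mailboxId).stream()
            .filter(seen -> !seen)
            .count();
        if (actual != store.readUnseenCounter(mailboxId)) {
            store.writeUnseenCounter(mailboxId, actual);
        }
    }
}
```

In this sketch the read that triggers the repair still returns the stale value, and only later reads observe the corrected counter. Injecting the random source and the executor is a design choice made here so that unit tests can force a repair and run it synchronously.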
= Definition Of Done
Record a video : In a webmail ...
- Given a read repair chance of 0.2 (20%)
- There is a mailbox counter inconsistency (10 unread mails in the mailbox but the counter displays 26) - (use cqlsh to alter the value of the mailbox counter to something invalid)
- The user does several reads - after some time, the inconsistency is fixed: The count of unread mails is 10!
As an admin, I should be able to configure (or disable) the read-repair-chance for the mailbox entity
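
As an illustration of what such a setting could look like on the code side, the following Java sketch parses and validates a read repair chance from a properties file. The property name `mailbox.read.repair.chance`, the `ReadRepairChance` class, and the choice to treat an absent property as "disabled" are all assumptions made for this example, not decisions stated in this ticket.

```java
import java.util.Properties;

// Hypothetical helper: the property name and the default are assumptions, not an agreed design.
final class ReadRepairChance {
    static final String PROPERTY = "mailbox.read.repair.chance"; // assumed name
    static final double DISABLED = 0.0;

    private ReadRepairChance() {
    }

    /** Reads the chance as a fraction in [0, 1]; 0 (or an absent property) disables read repair. */
    static double from(Properties properties) {
        String raw = properties.getProperty(PROPERTY);
        if (raw == null) {
            return DISABLED;
        }
        double chance = Double.parseDouble(raw.trim());
        if (Double.isNaN(chance) || chance < 0.0 || chance > 1.0) {
            throw new IllegalArgumentException(
                PROPERTY + " must be between 0 and 1 but was " + raw);
        }
        return chance;
    }
}
```

With this sketch, a value of `0.2` would correspond to the 20% chance used in the scenario above, and `0` would turn read repair off.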