Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Won't Fix
Description
With OAK-759 we introduced replica support in MongoMK. I think we still need to address resilience during failover from primary to secondary:
Consider a case where Oak writes to the primary while replication to the secondary is still in progress. If the primary goes down during that window and the secondary becomes primary, there could be some "half-replicated" MVCC revisions, which need to be either discarded or ignored after the failover.
This might not be an issue if there is only one shard, as the commit root is written last (and replicated last).
But with two shards, the replication state of the two shards could be inconsistent. Oak needs to handle such a situation without falling over.
If we can detect a Mongo failover, we could query Mongo for the revisions that are fully replicated to the new primary and discard the potentially half-replicated ones.
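
To make the idea concrete, here is a minimal sketch of the "discard half-replicated revisions" check. It is not Oak code and makes no calls to the Mongo driver. The class `FailoverRevisionCheck`, the method `halfReplicated`, and both input maps are hypothetical: the sketch assumes Oak could somehow know which shards each commit wrote to, and that after a failover it could list which revisions are present on each shard of the new primary. How to obtain either piece of information is left open.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Oak or MongoMK.
public class FailoverRevisionCheck {

    /**
     * Returns the revisions that are present on some, but not all, of the
     * shards they wrote to. These are the "half-replicated" revisions.
     *
     * @param touchedShards  revision id -> shards the commit wrote to (assumed bookkeeping)
     * @param presentOnShard shard name -> revision ids found on that shard after failover
     */
    static Set<String> halfReplicated(Map<String, Set<String>> touchedShards,
                                      Map<String, Set<String>> presentOnShard) {
        Set<String> discard = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : touchedShards.entrySet()) {
            String revision = entry.getKey();
            Set<String> shards = entry.getValue();
            int found = 0;
            for (String shard : shards) {
                Set<String> present =
                        presentOnShard.getOrDefault(shard, Collections.<String>emptySet());
                if (present.contains(revision)) {
                    found++;
                }
            }
            // found == 0: nothing arrived anywhere, so there is nothing to clean up.
            // found == shards.size(): the revision is fully replicated.
            if (found > 0 && found < shards.size()) {
                discard.add(revision);
            }
        }
        return discard;
    }

    private static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> touched = new HashMap<>();
        touched.put("r1", setOf("shardA", "shardB"));
        touched.put("r2", setOf("shardA", "shardB"));
        touched.put("r3", setOf("shardB"));

        // State of the new primaries after failover: r2 never reached shardB.
        Map<String, Set<String>> present = new HashMap<>();
        present.put("shardA", setOf("r1", "r2"));
        present.put("shardB", setOf("r1", "r3"));

        System.out.println(halfReplicated(touched, present)); // prints [r2]
    }
}
```

In this model a single-shard commit such as r3 can never be half-replicated, which matches the observation above that one shard is less of a concern. With two shards, r2 is flagged because only one of its two shards received it.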