Affects Version/s: None
Fix Version/s: None
Network partitioning can cause a split-brain case in which two leaders exist at the same time. We need some sort of testing infrastructure/framework to simulate such a case and verify whether our SCM HA implementation can achieve strong consistency under a partitioned network.
Mukul Kumar Singh suggested two possible approaches:
a) Blockade tests: Blockade is a Docker-based framework in which the
network of one DN can be isolated from the others
b) MiniOzoneChaosCluster: this is a unit-test-based test in which a
random datanode was killed, which helped in finding issues with
We might need a similar solution for SCM: block the SCM leader's network and also increase the timeout so that the old leader does not turn into a candidate.
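
The scenario described above can be sketched as a toy in-memory model, to make concrete what such a test would assert. This is only an illustration under stated assumptions: ToyCluster, Node, run_scenario, and the node names scm1..scm3 are hypothetical and are not Ozone, Ratis, or Blockade APIs. The model assumes a simple majority-quorum rule, and the isolated leader is deliberately never stepped down, which stands in for the increased timeout.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    term: int = 0
    is_leader: bool = False
    log: list = field(default_factory=list)


class ToyCluster:
    """Toy majority-quorum group (hypothetical; not Ozone or Ratis code)."""

    def __init__(self, names):
        self.nodes = {n: Node(n) for n in names}
        self.blocked = set()  # nodes whose network is cut off from all peers

    def reachable(self, name):
        # A blocked node only sees itself; others see every unblocked node.
        if name in self.blocked:
            return {name}
        return {n for n in self.nodes if n not in self.blocked}

    def quorum(self):
        return len(self.nodes) // 2 + 1

    def elect(self, name):
        # An election succeeds only if a majority is reachable.
        peers = self.reachable(name)
        if len(peers) < self.quorum():
            return False
        term = max(self.nodes[p].term for p in peers) + 1
        for p in peers:
            self.nodes[p].term = term
            self.nodes[p].is_leader = (p == name)
        return True

    def write(self, name, value):
        # A write commits only if the node is a leader with a reachable majority.
        node = self.nodes[name]
        peers = self.reachable(name)
        if not node.is_leader or len(peers) < self.quorum():
            return False
        for p in peers:
            self.nodes[p].log.append(value)
        return True


def run_scenario():
    cluster = ToyCluster(["scm1", "scm2", "scm3"])
    cluster.elect("scm1")
    cluster.write("scm1", "a")

    # Block the leader's network. It is never told to step down, so it
    # keeps believing it is the leader (the split-brain window).
    cluster.blocked.add("scm1")
    elected = cluster.elect("scm2")

    return {
        "new_leader_elected": elected,
        "leaders": sorted(n.name for n in cluster.nodes.values() if n.is_leader),
        "old_leader_write": cluster.write("scm1", "b"),
        "new_leader_write": cluster.write("scm2", "c"),
        "logs": {n.name: list(n.log) for n in cluster.nodes.values()},
    }


if __name__ == "__main__":
    print(run_scenario())
```

In this sketch two nodes consider themselves leader during the partition, but only the one that can reach a majority is able to commit a write. A real test for SCM HA would presumably check the same property against the actual cluster rather than a model.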