A replica in the down state can be elected leader, and it can stay down after the election.
Here's how I hit upon this:
- There are 3 replicas: leader, notleader0, notleader1
- Introduce a network partition to isolate notleader0 and notleader1 from leader (leader puts these two in LIR, i.e. leader-initiated recovery, via zk).
- Kill leader, remove the partition. Now leader is dead, and both notleader0 and notleader1 are down. There is no leader.
- Remove LIR znodes in zk.
- Wait a while, and a (flawed?) leader election takes place.
- Finally, the state is such that one of notleader0 or notleader1 (both of which were down before) becomes leader, but stays down.
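
To confirm the end state after the steps above, something like the following could be used to spot a replica that is flagged as leader while still down. This is only a sketch: it assumes the cluster state has already been fetched as JSON (for example from the Collections API CLUSTERSTATUS action) and has the usual collections / shards / replicas nesting with a `state` field and a `leader` flag per replica. The function name `find_down_leaders`, the collection name, and the sample data are made up for illustration; the replica names are just the labels used in the steps.

```python
def find_down_leaders(cluster_status):
    """Return (collection, shard, replica) triples for replicas that are
    flagged as leader while their state is 'down'.

    Assumption: cluster_status is a dict shaped like a CLUSTERSTATUS
    response, i.e. cluster -> collections -> shards -> replicas.
    """
    found = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                # The leader flag may be the string "true" or a boolean.
                is_leader = str(replica.get("leader", "false")).lower() == "true"
                if is_leader and replica.get("state") == "down":
                    found.append((coll_name, shard_name, replica_name))
    return found


# Hypothetical snapshot of the final state described in the steps above.
sample = {
    "cluster": {
        "collections": {
            "collection1": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "leader": {"state": "down"},
                            "notleader0": {"state": "down", "leader": "true"},
                            "notleader1": {"state": "down"},
                        }
                    }
                }
            }
        }
    }
}

print(find_down_leaders(sample))
```

An empty result would mean no shard currently has a down replica marked as leader.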