Initially, a request will be fully synchronous and will not return success to the client until the request has been sent to each replica. So if a leader goes down before all replicas receive and ACK the request, the client will not get an ACK. A new leader will be elected. When the downed, previous leader comes back, it will come up in recovery mode. I expect recovery to be a difficult part, and we have not fully worked it out yet. To recover, the node will have to talk to the leader and figure out what it has that it should not have, what it is missing, and so on. Then the recovering node either receives replays of the missing updates or replaces its entire index. Lots of details to work out here.
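To make the synchronous path concrete, here is a minimal sketch of what the leader's fan-out could look like. The ReplicaClient/UpdateRequest names and the single-threaded loop are assumptions for illustration, not the actual implementation:

```java
// Minimal sketch of the fully synchronous path described above. The types and
// method names here are hypothetical, not Solr's actual API.
import java.util.List;

class SyncLeader {
    private final List<ReplicaClient> replicas;

    SyncLeader(List<ReplicaClient> replicas) {
        this.replicas = replicas;
    }

    /** Returns success to the client only after every replica has ACK'd. */
    boolean handleUpdate(UpdateRequest req) {
        applyLocally(req);
        for (ReplicaClient replica : replicas) {
            // If the leader dies anywhere in this loop, the client never
            // receives an ACK and is expected to retry the request later.
            if (!replica.sendAndWaitForAck(req)) {
                return false;   // a replica failed to ACK; no success to the client
            }
        }
        return true;            // every replica ACK'd, so ACK the client
    }

    private void applyLocally(UpdateRequest req) {
        // index the document on the leader itself (details omitted)
    }
}

interface ReplicaClient {
    boolean sendAndWaitForAck(UpdateRequest req);
}

class UpdateRequest {
    String docId;
    long version;
}
```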
You have an interesting problem in that some replica leader candidates may have an update while others don't, since the leader may have died in the middle of relaying requests. We might prefer as the new leader the candidate with the greatest document version? Most client retries in this case will be fine (globally unique IDs are required, so no worry about dupes). Then the replicas talk to the new leader and sync up. Or, when a new leader is elected, the replicas just talk amongst each other and sync up, or…
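As a rough illustration of that election preference, the rule could be as simple as picking the live candidate with the greatest version it has seen. Candidate and highestVersionSeen are made-up names for this sketch; the real election would live in the cluster coordination code:

```java
// Hedged sketch of the "prefer the candidate with the greatest version" idea.
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class LeaderElection {
    /** Among the live candidates, pick the one that has indexed the highest version. */
    static Optional<Candidate> electLeader(List<Candidate> liveCandidates) {
        return liveCandidates.stream()
                .max(Comparator.comparingLong(c -> c.highestVersionSeen));
    }
}

class Candidate {
    final String nodeName;
    final long highestVersionSeen;   // highest doc version this replica has applied

    Candidate(String nodeName, long highestVersionSeen) {
        this.nodeName = nodeName;
        this.highestVersionSeen = highestVersionSeen;
    }
}
```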
If the leader fails right before sending an ACK, the client will likely repeat the request. In the case of doc adds/updates with the same ID, the retry will simply replace the previous (unacknowledged) success, or optimistic locking can be used to figure out that either its own update or someone else's already went through. The client would already suspect that its update may have gone through, because the connection would have timed out rather than returned a failure.
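Here is a sketch of what that retry looks like from the client's side, assuming optimistic locking via a per-document version and a globally unique document ID. The SolrLikeClient interface and exception names are hypothetical:

```java
// Client-side retry after a timeout, assuming optimistic concurrency.
class RetryingClient {
    private final SolrLikeClient client;

    RetryingClient(SolrLikeClient client) {
        this.client = client;
    }

    void addWithRetry(String docId, String body, long expectedVersion) {
        try {
            client.add(docId, body, expectedVersion);
        } catch (RequestTimeoutException timedOut) {
            // The connection timed out rather than returning a failure, so the
            // first attempt may already have been applied. Retrying with the
            // same unique id either replaces the earlier, unacknowledged success...
            try {
                client.add(docId, body, expectedVersion);
            } catch (VersionConflictException conflict) {
                // ...or the version check shows that some update (ours or
                // someone else's) already went through.
            }
        }
    }
}

interface SolrLikeClient {
    void add(String docId, String body, long expectedVersion);
}

class RequestTimeoutException extends RuntimeException {}
class VersionConflictException extends RuntimeException {}
```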
Eventually, we might consider a mode where the request is ACK'd before it's on all replicas, in which case you might accept a higher risk of data loss.
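If we do add such a mode, one shape it could take is a configurable minimum number of replica ACKs, with the remaining replicas updated best-effort. This is only a sketch under that assumption; minAcks and the types below are illustrative, not real parameters:

```java
// Relaxed-ACK sketch: answer the client after minAcks replicas confirm.
import java.util.List;

class RelaxedAckLeader {
    interface Replica {
        boolean sendAndWaitForAck(Update u);   // synchronous, waits for the ACK
        void sendAsync(Update u);              // fire-and-forget, no wait
    }

    static final class Update {
        String docId;
        long version;
    }

    private final List<Replica> replicas;
    private final int minAcks;   // replica ACKs required before answering the client

    RelaxedAckLeader(List<Replica> replicas, int minAcks) {
        this.replicas = replicas;
        this.minAcks = minAcks;
    }

    boolean handleUpdate(Update u) {
        int acks = 0;
        for (Replica r : replicas) {
            if (acks < minAcks) {
                if (r.sendAndWaitForAck(u)) {
                    acks++;                    // wait until enough replicas confirm
                }
            } else {
                // The remaining replicas catch up asynchronously; a crash in
                // this window is the extra data-loss risk mentioned above.
                r.sendAsync(u);
            }
        }
        return acks >= minAcks;   // ACK the client without waiting on every replica
    }
}
```

The lower minAcks is, the faster the client gets an answer and the larger the window in which a crash can lose data the client thinks was accepted.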
As for the case where indexes diverge because some replicas commit a change while others do not: it's an area we have not fully worked out (though Yonik has likely thought about a lot of this more than I have). Initially, Yonik's point was that you can usually expect success on all nodes, unless the issue is something that would require the node to come down and then come back in recovery mode. We certainly want to be resilient here eventually. As we work through the recovery scenarios, I think this will become clearer.
Long story short, we have been discussing and thinking about these various scenarios, but largely we are taking things one issue at a time.