If two nodes that own adjacent portions of the ring are bootstrapped at the same time (i.e. always for vnodes), they will not receive the correct data for pending writes (and perhaps not for streaming; TBC).
By default, we don't "permit" multiple nodes to bootstrap at once, but:
- The logic we use to prevent this itself isn’t strongly consistent (or atomically applied). If two nodes start bootstrapping close together in time, or simply get divergent gossip state, they can both believe there is no other node bootstrapping and proceed.
- The bug doesn’t require two nodes to actually bootstrap at the same time. It only requires divergent gossip state on a coordinator, so that the coordinator believes multiple nodes are bootstrapping, even though one of them may have already completed and they never overlapped in reality.
- We can bootstrap and remove nodes concurrently, I think? I’m pretty sure this can also be unsafe, but needs some more thought.
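
To make the first two points concrete, here is a toy sketch of the check-then-act problem. It is not Cassandra code: `GossipView`, `may_bootstrap` and `start_bootstrap` are hypothetical names, and gossip is reduced to a per-node set of endpoints believed to be bootstrapping. It only illustrates that a guard evaluated against local, possibly stale state cannot prevent either failure mode.

```python
# Toy model only: none of these names correspond to real Cassandra classes.

class GossipView:
    """One node's local, possibly stale, picture of who is bootstrapping."""

    def __init__(self):
        self.bootstrapping = set()


def may_bootstrap(view, me):
    # The guard consults local state only; nothing makes it atomic cluster-wide.
    return not (view.bootstrapping - {me})


def start_bootstrap(view, me):
    # Check-then-act: both steps run against this node's own view.
    if not may_bootstrap(view, me):
        return False
    view.bootstrapping.add(me)
    return True


# Case 1: A and B start close together, before gossip has propagated either one.
view_a, view_b = GossipView(), GossipView()
a_started = start_bootstrap(view_a, "A")
b_started = start_bootstrap(view_b, "B")  # B has not heard about A yet

# Case 2: A finished before B started, but the coordinator has not yet
# learned that A completed, so it still counts both as bootstrapping.
coordinator = GossipView()
coordinator.bootstrapping.update({"A", "B"})
actually_bootstrapping = {"B"}
coordinator_sees_overlap = len(coordinator.bootstrapping) > 1
real_overlap = len(actually_bootstrapping) > 1
```

In this model both `a_started` and `b_started` end up true, and `coordinator_sees_overlap` is true while `real_overlap` is false, which mirrors the two scenarios described above.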