Fix Version/s: 4.0
We currently send an anticompaction request to all replicas. During anticompaction, a node splits sstables and marks the appropriate ones repaired.
The problem is that this can fail on some replicas for many reasons, which leads to problems in the next repair.
Here is what I suggest to improve it.
1) Send an anticompaction request to all replicas. This can be done at session level.
2) During anticompaction, sstables are split but not marked repaired.
3) When we get a positive ack from all replicas, the coordinator sends another message called markRepaired.
4) On getting this message, replicas mark the appropriate sstables as repaired.
This will reduce the window of failure. We could also consider "hinting" the markRepaired message if required.
Also, the sstables which are being streamed can be marked as repaired, as is done now.
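
The four steps above can be sketched roughly as follows. This is only an illustrative model in Python, not Cassandra code: the `Replica` class, its `pending` and `repaired` sets, and the `run_session` function are hypothetical names made up for this example, and failures are simulated with a flag.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for a replica node; not a Cassandra class."""
    name: str
    fail_anticompaction: bool = False            # simulate a failure in step 2
    pending: set = field(default_factory=set)    # split, but not yet marked repaired
    repaired: set = field(default_factory=set)

    def anticompact(self, sstables):
        # Steps 1-2: split the sstables, but do not mark anything repaired yet.
        if self.fail_anticompaction:
            return False  # negative ack
        self.pending = set(sstables)
        return True       # positive ack

    def mark_repaired(self):
        # Step 4: mark everything split during anticompaction as repaired.
        self.repaired |= self.pending
        self.pending = set()


def run_session(replicas, sstables):
    """Coordinator side: send markRepaired only if every replica acked."""
    acks = [r.anticompact(sstables) for r in replicas]
    if not all(acks):
        return False  # step 3 not reached; no replica marks anything repaired
    for r in replicas:
        r.mark_repaired()
    return True
```

In this model, a failure on any replica during anticompaction leaves every replica's `repaired` set untouched, so replicas do not disagree about what is repaired. The remaining failure window is the delivery of markRepaired itself, which is where "hinting" would come in.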