This patch is trying to do two things:
1. Before this patch: take a quorum read with RF = 3 (replica1, replica2, replica3), so the client request reads from 2 replicas (replica1, replica2). If there is a digest mismatch between these 2 replicas, read repair kicks in. Say the stale data is on replica2; read repair sends the correct data to replica2. If that write request times out, we return a read timeout to the client.
After this patch, we wait on the replica2 write for some time; if it doesn't come back, the correct data is sent to replica3, whether or not replica3 already has the latest data. If the replica3 write succeeds, 2 replicas are guaranteed to have the correct data, so the read can return success with data to the client, and the next quorum read is guaranteed to read the correct data.
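A minimal Python sketch of the counting argument above (the function names and the ack bookkeeping are illustrative, not Cassandra's actual code): a quorum read succeeds once `blockFor` replicas are known to hold the latest data, so when the repair write to the stale replica times out, forwarding the repair to another replica can still satisfy the quorum.

```python
def quorum_block_for(rf):
    """Number of replicas a QUORUM read must confirm (RF=3 -> 2)."""
    return rf // 2 + 1

def quorum_read_succeeds(rf, up_to_date_replicas, repair_acks, forwarded_ack=False):
    """Decide whether the read can return success.

    up_to_date_replicas: replicas that already held the latest data.
    repair_acks: dict of stale replica -> whether its repair write was acked.
    forwarded_ack: whether a repair forwarded to an extra replica was acked
                   (the behavior this patch adds).
    """
    confirmed = up_to_date_replicas + sum(repair_acks.values()) + int(forwarded_ack)
    return confirmed >= quorum_block_for(rf)

# RF=3: replica1 has the latest data, repair to replica2 timed out.
# Before the patch: only 1 confirmed replica -> read timeout.
print(quorum_read_succeeds(3, 1, {"replica2": False}))                      # False
# After the patch: repair forwarded to replica3 and acked -> 2 confirmed.
print(quorum_read_succeeds(3, 1, {"replica2": False}, forwarded_ack=True))  # True
```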
2. The second thing this patch does is ensure that in the read-repair path we don't block on more replicas than the consistency level requires, in the speculative retry and read repair chance cases. Using the same RF = 3 quorum read: the read targets replica1 and replica2, but replica2 is slow, so speculative retry kicks in and the read also goes to replica3. All 3 replica reads come back, there is a digest mismatch, and both replica2 and replica3 have stale data. Before this patch, read repair would block for both replica2 and replica3 to finish; but there is no need to wait for both, since one successful repair is enough to guarantee a successful quorum read, and the next quorum read will read the latest data even if replica3's read repair failed. The same applies to read repair chance: if read_repair_chance triggers a GLOBAL repair, we don't need to block for all replicas to finish repair, only for as many as the read consistency level needs.
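The "block only for what the consistency level needs" rule can be sketched as a small helper (illustrative, not Cassandra's actual code): of the replicas contacted, some are already up to date; we only need enough successful repairs to bring the up-to-date count to `blockFor`.

```python
def repairs_to_block_for(block_for, contacted, stale):
    """Repair acks to block on before answering the client.

    block_for:  replicas the consistency level requires (QUORUM, RF=3 -> 2).
    contacted:  replicas actually read (may exceed block_for due to
                speculative retry or read repair chance).
    stale:      how many of the contacted replicas returned stale data.
    """
    up_to_date = contacted - stale
    # Only enough repairs to reach block_for; never block on the rest.
    return max(0, block_for - up_to_date)

# Speculative retry example from the text: 3 replicas contacted,
# replica2 and replica3 stale. Old behavior blocked on 2 repairs;
# new behavior blocks on just 1.
print(repairs_to_block_for(block_for=2, contacted=3, stale=2))  # 1
# read repair chance (GLOBAL) repairs all replicas, but we still
# block only for what QUORUM needs.
print(repairs_to_block_for(block_for=2, contacted=3, stale=1))  # 0
```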