  1. Cassandra
  2. CASSANDRA-3199

Counter write protocol: have the coordinator (instead of the first replica) wait for replica responses directly



    • Improvement
    • Status: Resolved
    • Low
    • Resolution: Won't Fix
    • None
    • None


      The current counter write protocol is as follows (taking the case where the write coordinator != first replica):

      1. the coordinator forwards the write request to the first replica
      2. the first replica writes locally and replicates to the other replicas
      3. the first replica waits for enough answers from the other replicas to satisfy the consistency level
      4. the first replica acks the coordinator, which completes the write to the client

      This ticket proposes to modify this protocol to:

      1. the coordinator forwards the write request to the first replica
      2. the first replica writes locally, acks the coordinator for its own write, and replicates to the other replicas
      3. the other replicas respond directly to the coordinator
      4. once the coordinator has enough responses, it completes the write
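
      The proposed flow can be sketched as a small in-memory model. This is only an illustration, not Cassandra code: the `Coordinator` and `Replica` classes, the `reply_to` parameter, and `proposed_write` are hypothetical names, messages are plain method calls, and replicating the raw delta is a simplification of how counter state is actually propagated.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical coordinator tracking acks for a single counter write."""
    required_acks: int  # acks needed to satisfy the consistency level
    acked_by: set = field(default_factory=set)

    def on_ack(self, replica_name: str) -> None:
        # Every replica, including the first one, acks the coordinator directly.
        self.acked_by.add(replica_name)

    @property
    def completed(self) -> bool:
        return len(self.acked_by) >= self.required_acks


@dataclass
class Replica:
    """Hypothetical replica holding one counter value."""
    name: str
    counter: int = 0

    def apply(self, delta: int, reply_to: Coordinator) -> None:
        # Write locally, then ack the node named in `reply_to`,
        # which is not necessarily the node that sent the message.
        self.counter += delta
        reply_to.on_ack(self.name)


def proposed_write(coordinator, first, others, delta):
    # Steps 1-2: the coordinator forwards to the first replica, which
    # writes locally and acks the coordinator for its own write.
    first.apply(delta, reply_to=coordinator)
    # Steps 2-3: the first replica replicates; each other replica
    # responds to the coordinator instead of the first replica.
    for replica in others:
        replica.apply(delta, reply_to=coordinator)
    # Step 4: the coordinator completes once it has enough responses.
    return coordinator.completed


coordinator = Coordinator(required_acks=2)
done = proposed_write(coordinator, Replica("r1"), [Replica("r2"), Replica("r3")], delta=5)
```

      In this model the first replica never collects answers from the other replicas; carrying a `reply_to` target on the replication message is one possible way to express that a replica answers a node other than the sender.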

      I see 2 advantages to this new protocol:

      • it should be a tad faster since it parallelizes wire transfer better
      • it would make a TimeoutException a bit less likely and, more importantly, a TimeoutException would much more likely mean that the write hasn't been persisted. Indeed, in the current protocol, once the first replica has sent the write to the other replicas, it has to wait for their answers and then answer the coordinator. If it dies during that time, we will return a TimeoutException, even though the first replica died after having done its main job.
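
      A rough latency model illustrates the first point. It is a sketch under stated assumptions: only one-way network delays are counted (local write time is ignored), `required` counts the first replica's own write as one ack, and the function and parameter names are hypothetical.

```python
def current_latency(d_cf, d_fc, others, required):
    """Time until the coordinator can complete under the current protocol.

    d_cf / d_fc: one-way delay coordinator -> first replica and back.
    others: list of (first -> other, other -> first) one-way delays.
    """
    if required <= 1:
        # The first replica's own write is enough; no remote answer needed.
        return d_cf + d_fc
    # The first replica waits for (required - 1) round trips to finish,
    # and only then acks the coordinator.
    round_trips = sorted(out + back for out, back in others)
    return d_cf + round_trips[required - 2] + d_fc


def proposed_latency(d_cf, d_fc, others, required):
    """Time until the coordinator can complete under the proposed protocol.

    others: list of (first -> other, other -> coordinator) one-way delays.
    """
    # Each ack travels to the coordinator on its own path.
    arrivals = [d_cf + d_fc] + [d_cf + out + to_coord for out, to_coord in others]
    return sorted(arrivals)[required - 1]


# Uniform one-way delay of 1 unit, two other replicas, two acks required.
links = [(1, 1), (1, 1)]
current = current_latency(1, 1, links, required=2)    # 4 hops on the critical path
proposed = proposed_latency(1, 1, links, required=2)  # 3 hops on the critical path
```

      With uniform delays the proposed protocol removes one hop from the critical path whenever at least one remote ack is required. The model also shows why a first-replica failure matters less: in `proposed_latency` the acks from the other replicas do not pass through the first replica on their way back.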

      The downside is that this adds a bit of complexity. In particular, the "other replicas" would have to answer the coordinator for a query that was issued by the first replica.




            Unassigned Unassigned
            slebresne Sylvain Lebresne