CASSANDRA-15442

Read repair implicitly increases read timeout value

    Details

      Description

      When read repair occurs during a read, the read internally runs several blocking operations in sequence. See org.apache.cassandra.service.StorageProxy#fetchRows.
      The timeline of the blocking operations is:

      1. Regular read: wait for the full data/digest read responses to complete. reads[*].awaitResponses();
      2. Read repair read: wait for the full data read responses to complete. reads[*].awaitReadRepair();
      3. Read repair write: wait for the write responses to complete. concatAndBlockOnRepair(results, repairs);

      Steps 1 and 2 share a single timeout and together wait for at most the read timeout, say 5 s.
      Step 3 waits for up to the write timeout, say 2 s.
      In the worst case, the actual time taken by a read can add up to ~7 s, as long as no individual step exceeds its own timeout.
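The accumulation described above can be sketched with a small timing model. This is only an illustration under stated assumptions, not Cassandra code: the class `ReadRepairTimeline` and the method `elapsedMillis` are hypothetical names, and the model assumes that steps 1 and 2 are checked against one shared read budget while step 3 is checked only against the write timeout.

```java
import java.util.OptionalLong;

public class ReadRepairTimeline {
    /**
     * Hypothetical model: returns the total elapsed time in milliseconds if no
     * step times out, or an empty value if any step exceeds its budget.
     */
    public static OptionalLong elapsedMillis(long readTimeout, long writeTimeout,
                                             long regularRead, long repairRead,
                                             long repairWrite) {
        // Assumption: steps 1 and 2 draw on one shared read budget.
        if (regularRead + repairRead > readTimeout) {
            return OptionalLong.empty();
        }
        // Assumption: step 3 is checked only against the write timeout,
        // regardless of how much time steps 1 and 2 already consumed.
        if (repairWrite > writeTimeout) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(regularRead + repairRead + repairWrite);
    }

    public static void main(String[] args) {
        // Read timeout 5 s, write timeout 2 s, every step just within its budget.
        OptionalLong total = elapsedMillis(5000, 2000, 3000, 2000, 2000);
        System.out.println(total.getAsLong() + " ms"); // prints "7000 ms"
    }
}
```

In this model the request completes without any timeout being reported, even though it took 7 s against a 5 s read timeout.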
      From the client's perspective, a request is not expected to take longer than the timeout configured in the database.
      This is especially bad for clients that have set up client-side timeout monitoring close to the configured value. Such clients consider the operation timed out and abort, while it is in fact still running on the server.
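The client-side effect can be illustrated with a minimal sketch. It uses a plain `ExecutorService` as a stand-in for the server and `Future.get` with a timeout as a stand-in for the client's timeout monitoring. The class `ClientTimeoutDemo` and the method `serverOutlivesClient` are hypothetical, and the durations are scaled down to milliseconds. Nothing here reflects actual driver or coordinator behavior. It only shows that a caller giving up on a wait does not stop the work it was waiting for.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class ClientTimeoutDemo {
    /**
     * Returns true if the "client" timed out while the "server" task
     * still ran to completion afterwards.
     */
    public static boolean serverOutlivesClient(long serverWorkMs, long clientTimeoutMs) {
        ExecutorService server = Executors.newSingleThreadExecutor();
        AtomicBoolean serverCompleted = new AtomicBoolean(false);
        boolean clientTimedOut = false;
        try {
            // Stand-in for the server-side request (read plus read repair).
            Future<?> request = server.submit(() -> {
                try {
                    Thread.sleep(serverWorkMs);
                    serverCompleted.set(true);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            try {
                request.get(clientTimeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The client stops waiting, but the task is not cancelled.
                clientTimedOut = true;
            }
            // Let the already-submitted task finish, then check its outcome.
            server.shutdown();
            server.awaitTermination(serverWorkMs + 1000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            server.shutdownNow();
        }
        return clientTimedOut && serverCompleted.get();
    }

    public static void main(String[] args) {
        // Server work of 300 ms against a client timeout of 100 ms.
        System.out.println(serverOutlivesClient(300, 100));
    }
}
```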

              People

              • Assignee:
                yifanc Yifan Cai
                Reporter:
                yifanc Yifan Cai
                Authors:
                Yifan Cai
                Reviewers:
                Blake Eggleston, Jordan West
              • Votes:
                0
                Watchers:
                6

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  10m