Details
- Bug
- Status: Resolved
- Normal
- Resolution: Fixed
- Availability - Response Crash
- Low
- Normal
- Code Inspection
- All
- None
Description
When read repair occurs during a read, several blocking operations run in sequence internally. See org.apache.cassandra.service.StorageProxy#fetchRows.
The timeline of the blocking operations is:
1. Regular read: wait for the full data/digest read responses to complete. reads[*].awaitResponses();
2. Read repair read: wait for the full data read responses to complete. reads[*].awaitReadRepair();
3. Read repair write: wait for the write responses to complete. concatAndBlockOnRepair(results, repairs);
Steps 1 and 2 share the same timeout and together wait for at most the read timeout, say 5 s.
Step 3 waits for up to the write timeout, say 2 s.
In the worst case, the total time taken by a read can accumulate to ~7 s, even though no individual step exceeds its own timeout.
From the client's perspective, a request is not expected to take longer than the timeout configured on the database.
This scenario is especially bad for clients whose client-side timeout is set close to the configured server-side value. Such clients conclude that the operation timed out and abort, while it is in fact still running on the server.
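The accumulation described above can be written out as plain arithmetic. The following is only an illustrative sketch: the class and method names are hypothetical and are not part of Cassandra, and the 5 s and 2 s values are the example figures used in this description.

```java
import java.time.Duration;

// Hypothetical helper, not Cassandra code: models the worst-case latency of a
// read that triggers a blocking read repair.
public class ReadRepairWorstCase {
    // Steps 1 and 2 share a single read-timeout budget; step 3 then waits on
    // its own write-timeout budget, so the two budgets add up.
    public static Duration worstCase(Duration readTimeout, Duration writeTimeout) {
        return readTimeout.plus(writeTimeout);
    }

    public static void main(String[] args) {
        Duration readTimeout = Duration.ofSeconds(5);  // example value from the description
        Duration writeTimeout = Duration.ofSeconds(2); // example value from the description
        System.out.println(worstCase(readTimeout, writeTimeout).toMillis() + " ms");
    }
}
```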
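To make the client-side effect concrete, the sketch below estimates how long the server may keep working after the client has already given up. It is a simplified model, not Cassandra or driver code: the class name, the method name, and the 5.5 s client timeout are assumptions chosen for illustration, and the 7 s server time is the worst case from the description.

```java
import java.time.Duration;

// Hypothetical model, not Cassandra or driver code.
public class ClientAbortGap {
    // Time the server keeps working after the client has aborted.
    // Returns zero when the server finishes before the client timeout fires.
    public static Duration orphanedWork(Duration clientTimeout, Duration serverElapsed) {
        Duration gap = serverElapsed.minus(clientTimeout);
        return gap.isNegative() ? Duration.ZERO : gap;
    }

    public static void main(String[] args) {
        Duration clientTimeout = Duration.ofMillis(5500); // assumed: slightly above the 5 s read timeout
        Duration serverElapsed = Duration.ofSeconds(7);   // worst case: 5 s read + 2 s repair write
        System.out.println(orphanedWork(clientTimeout, serverElapsed).toMillis() + " ms");
    }
}
```

Under these assumed values, the request is still occupying server resources for about 1.5 s after the client has reported a timeout.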
Issue Links
- relates to
  - CASSANDRA-2494 Quorum reads are not monotonically consistent (Resolved)
  - CASSANDRA-14635 Support table level configuration of monotonic reads (Resolved)