Cassandra / CASSANDRA-15442

Read repair implicitly increases read timeout value

Details

    Description

      When read repair occurs during a read, it internally starts several blocking operations in sequence. See org.apache.cassandra.service.StorageProxy#fetchRows.
      The timeline of the blocking operations:

      1. Regular read: wait for the full data/digest read responses to complete. reads[*].awaitResponses();
      2. Read repair read: wait for the full data read responses to complete. reads[*].awaitReadRepair();
      3. Read repair write: wait for the write responses to complete. concatAndBlockOnRepair(results, repairs);

      Steps 1 and 2 share the same timeout and together wait for at most the read timeout, say 5 s.
      Step 3 waits for up to the write timeout, say 2 s.
      In the worst case, the actual time taken for a read can accumulate to ~7 s, even though no individual step exceeds its timeout value.
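
      The accumulation across the three steps can be sketched as follows. This is a simplified model only, not Cassandra code: the class ReadRepairTimeline, the method elapsedIfNoTimeout and its parameters are hypothetical names chosen for illustration, and the model assumes steps 1 and 2 are checked against one shared read-timeout budget while step 3 gets a separate write-timeout budget.

```java
import java.util.OptionalLong;

// Hypothetical model of the three blocking steps; not part of Cassandra.
final class ReadRepairTimeline {
    /**
     * Returns the total elapsed time in milliseconds if no timeout fires,
     * or an empty value if either budget is exceeded.
     */
    static OptionalLong elapsedIfNoTimeout(long readMs, long repairReadMs, long repairWriteMs,
                                           long readTimeoutMs, long writeTimeoutMs) {
        // Steps 1 and 2 (regular read + read repair read) share the read-timeout budget.
        if (readMs + repairReadMs > readTimeoutMs) {
            return OptionalLong.empty();
        }
        // Step 3 (read repair write) is checked against its own write-timeout budget.
        if (repairWriteMs > writeTimeoutMs) {
            return OptionalLong.empty();
        }
        // No timeout fired, yet the total may exceed the read timeout.
        return OptionalLong.of(readMs + repairReadMs + repairWriteMs);
    }
}
```

      With the example values from the description (5 s read timeout, 2 s write timeout), step durations of 3000 ms, 1900 ms and 1900 ms trigger no timeout in this model, but the request takes 6800 ms in total.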
      From the client's perspective, a request is not expected to take longer than the timeout value configured in the database.
      This scenario is especially bad for clients that have set up client-side timeout monitoring close to the configured value. Such clients consider the operation timed out and abort, while it is in fact still running on the server.
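
      The client-side effect can be illustrated with a small, self-contained sketch. It does not use any Cassandra driver API: ClientTimeoutDemo, serverRead and clientAbortsWhileServerRuns are hypothetical names, and the server is simulated by a task that sleeps. Durations are scaled down so the example runs quickly.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical simulation; no real driver or server is involved.
final class ClientTimeoutDemo {
    // Stand-in for a server-side read that needs serverMs to finish.
    static CompletableFuture<String> serverRead(long serverMs) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(serverMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "rows";
        });
    }

    // Returns true when the client gave up while the simulated server work was still running.
    static boolean clientAbortsWhileServerRuns(long serverMs, long clientTimeoutMs) {
        CompletableFuture<String> inFlight = serverRead(serverMs);
        try {
            inFlight.get(clientTimeoutMs, TimeUnit.MILLISECONDS);
            return false; // the response arrived within the client's timeout
        } catch (TimeoutException e) {
            // The client has timed out; the simulated server task has not finished.
            return !inFlight.isDone();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      Calling clientAbortsWhileServerRuns(700, 100) models a client whose timeout is shorter than the time the server actually spends on the request: the client sees a timeout while the simulated read is still in progress.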

          People

            yifanc Yifan Cai
            yifanc Yifan Cai
            Yifan Cai
            Blake Eggleston, Jordan West
            Votes: 0
            Watchers: 6

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 10m