Details
- Type: Bug
- Status: Resolved
- Priority: Normal
- Resolution: Fixed
- Fix Version: None
- Severity: Normal
Description
It's currently possible for DataResolver to accumulate more read repair changes than would fit in a single serialized mutation. If that happens, the node receiving the mutation will fail to apply it, the read will time out, and reads of that partition won't be able to proceed until the operator runs repair or manually drops the affected partitions.
Ideally we should either perform read repair iteratively, or at least split the resulting mutation into smaller chunks at the end. In the meantime, for 3.0.x, I suggest we add logging to catch this, and a -D flag to allow proceeding with the requests as is, without read repair, when the mutation is too large.
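A minimal sketch of the "split into smaller chunks" idea: pack the accumulated changes greedily into groups whose combined serialized size stays under a limit, and send one mutation per group. All names here are illustrative (this is not Cassandra's DataResolver API), and changes are represented only by their serialized sizes for brevity; an oversized single change still gets its own chunk, which the caller would have to handle separately (e.g. log and skip read repair for it).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: instead of one oversized read-repair mutation,
// split the accumulated changes into chunks that each fit under maxSize.
public class ReadRepairChunker {
    // Greedily packs per-change serialized sizes into chunks whose total
    // stays <= maxSize. A single change larger than maxSize ends up alone
    // in its own (still oversized) chunk for the caller to deal with.
    static List<List<Integer>> chunkBySize(List<Integer> changeSizes, int maxSize) {
        List<List<Integer>> chunks = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentSize = 0;
        for (int size : changeSizes) {
            // Flush the current chunk before it would overflow.
            if (!current.isEmpty() && currentSize + size > maxSize) {
                chunks.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
```

Greedy packing keeps the change simple and preserves the original ordering of the changes, at the cost of not producing a minimal number of chunks.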