Details

      Description

SSTables streamed during the repair process will first be written locally and afterwards either simply added to the pool of existing sstables or, in case of existing MVs or active CDC, replayed on a per-mutation basis:

      As described in StreamReceiveTask.OnCompletionRunnable:

      We have a special path for views and for CDC.

      For views, since the view requires cleaning up any pre-existing state, we must put all partitions through the same write path as normal mutations. This also ensures any 2is are also updated.

      For CDC-enabled tables, we want to ensure that the mutations are run through the CommitLog so they can be archived by the CDC process on discard.
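A minimal sketch of that branching on stream completion, using hypothetical helper names (hasMaterializedViews, isCdcEnabled, partitionsOf and toMutation are placeholders, not actual Cassandra APIs), could look like this:

// Hedged sketch only -- the real logic lives in StreamReceiveTask.OnCompletionRunnable.
void onStreamCompleted(ColumnFamilyStore cfs, Collection<SSTableReader> received)
{
    boolean requiresWritePath = hasMaterializedViews(cfs) || isCdcEnabled(cfs);

    if (!requiresWritePath)
    {
        // Fast path: the sstables join the live set as-is, keeping their metadata
        // (including repairedAt when they were streamed by an incremental repair).
        cfs.addSSTables(received);
        return;
    }

    // Slow path: every partition is replayed as a normal mutation so that view updates,
    // secondary indexes and the commit log/CDC are maintained. The repairedAt state of
    // the streamed data is lost on this path.
    for (SSTableReader sstable : received)
        for (Partition partition : partitionsOf(sstable))
            toMutation(partition).apply();
}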

Using the regular write path turns out to be an issue for incremental repairs, as we lose the repaired_at state in the process. Eventually the streamed rows will end up in the unrepaired set, in contrast to the rows on the sender side, which have been moved to the repaired set. The next repair run will stream the same data back again, causing rows to bounce back and forth between nodes on each repair.

See the linked dtest for steps to reproduce. An example of reproducing this manually using ccm can be found here.

          Activity

          pauloricardomg Paulo Motta added a comment -

The quickest (but a bit dirty) fix here would be to skip anti-compaction altogether when there is a mismatch for MV tables, forcing data to be re-compared at the next repair - and if they match, do anti-compaction.

The proper solution is to segregate repaired from unrepaired data in the memtable, and flush them to separate repaired/unrepaired sstables, but this would probably be a bit more involved.

          bdeggleston Blake Eggleston added a comment -

          skip anti-compaction altogether when there is a mismatch for MV tables

          I'm not sure how effective that would be in practice. In an active cluster, I'd expect the race between in flight mutations and flushes to usually result in at least a little bit of streaming.

          pauloricardomg Paulo Motta added a comment -

          I'm not sure how effective that would be in practice. In an active cluster, I'd expect the race between in flight mutations and flushes to usually result in at least a little bit of streaming.

Good point! How about keeping the streamed sstables, and having a special mutation.apply path that only writes to the commit log/CDC and applies MVs, while skipping application of the base table mutations? That seems simpler than keeping repaired state at the memtable, unless there are caveats I'm missing.
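A rough sketch of such an apply variant, assuming hypothetical helpers (streamedRepairApply, hasViews and applyViewUpdatesOnly are not existing Cassandra APIs), might look like this:

// Hedged sketch only: feed the commit log (so CDC can archive the data) and apply the
// MV updates, but never touch the base table memtable, because the streamed sstable
// itself is kept with its repairedAt metadata intact.
void streamedRepairApply(Mutation mutation, Keyspace keyspace)
{
    CommitLog.instance.add(mutation);                  // commit log / CDC still sees the data

    for (PartitionUpdate update : mutation.getPartitionUpdates())
    {
        if (hasViews(keyspace, update))                // assumption: per-table "has views" check
            applyViewUpdatesOnly(keyspace, update);    // hypothetical: read-before-write + MV mutations only
        // The base table update is intentionally skipped.
    }
}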

          spodxx@gmail.com Stefan Podkowinski added a comment -

What also puzzles me is that MVs are getting repaired just as usual tables would when running repair for a single keyspace. Once CFs have been retrieved for a KS, no filtering for MVs seems to happen and they'll be repaired just like regular tables. I'm not sure if that's intentional. There should really be just a single way to repair MVs: either rely on keeping them consistent with the base table and exclusively follow the mutation-based approach described in this ticket, or treat MVs just as regular tables during repairs. However, in the second case we'd have to make sure to always include all MVs of repaired tables in a repair session, in order to keep them consistent.

          tjake T Jake Luciani added a comment -

Once CFs have been retrieved for a KS, no filtering for MVs seems to happen and they'll be repaired just like regular tables. I'm not sure if that's intentional.

          This was intentional, though you really shouldn't repair the whole KS with MVs in it, but rather just the base tables. You can repair only the views in the case of a corrupt/lost sstable in the view since it would be faster (as it wouldn't go through the memtable).

          bdeggleston Blake Eggleston added a comment -

How about keeping the streamed sstables, and having a special mutation.apply path that only writes to the commit log/CDC and applies MVs

          That does seem like it would be the simplest way to do it. You'd just need to skip this loop. When those commit log entries get replayed on startup though, you'll have repaired data leak into unrepaired (or at least be duplicated there), causing data to be re-streamed on the next repair. You could avoid that with a 'cdc-only' flag on the commit log entry though.

          brstgt Benjamin Roth added a comment -

https://github.com/Jaumo/cassandra/tree/12905-3.0 (see 7ab84f86fbafeb02e1ed7fd75561c7e598e10a1d)
+
https://github.com/Jaumo/cassandra/tree/12905-3.X (see 7ab84f86fbafeb02e1ed7fd75561c7e598e10a1d)

The previous commit belongs to CASSANDRA-12905, but CASSANDRA-12888 depends on it.

          brstgt Benjamin Roth added a comment -

          Let me explain my patch:

Sending streams of tables with MVs (or CDC) through the regular write path has several major negative impacts:

          1. Bootstrap
          During a bootstrap, all ranges from all KS and all CFs that will belong to
          the new node will be streamed. MVs are treated like all other CFs and all
          ranges that will move to the new node will also be streamed during
          bootstrap.
          Sending streams of the base tables through the write path will have the
          following negative impacts:

          • Writes are sent to the commit log. Not necessary. When node is stopped
            during bootstrap, bootstrap will simply start over. No need to recover from
            commit logs. Non-MV tables won't have a CL anyway
• MV mutations will not be applied instantly but sent to the batch log.
            This is of course necessary during the range movement (if PK of MV differs
            from base table) but what happens: The batchlog will be completely flooded.
            This leads to ridiculously large batchlogs (I observed BLs with 60GB
            size), zillions of compactions and quadrillions of tombstones. This is a
            pure resource killer, especially because BL uses a CF as a queue.
          • Applying every mutation separately causes read-before-writes during MV
            mutation. This is of course an order of magnitude slower than simply
            streaming down an SSTable. This effect becomes even worse while bootstrap
            progresses and creates more and more (uncompacted) SSTables. Many of them
won't ever be compacted because the batchlog eats all the resources
            available for compaction
          • Streaming down the MV tables AND applying the mutations of the
base tables leads to redundant writes. Redundant writes are local if the PK of
            the MV == PK of the base table and - even worse - remote if not. Remote MV
            updates will impact nodes that aren't even part of the bootstrap.
• CDC should also not be necessary during bootstrap. A bootstrap is not a data change; it is a data relocation, and all data changes must already have been logged on the source node.

          2. Repair
The negative impact is similar to bootstrap, but ...

• Sending repairs through the write path will not mark the streamed tables
  as repaired. NOT doing so would instantly solve this issue, and much more
  simply than any other solution
• It will change the "repair design" a bit. Repairing a base table will
  not automatically repair the MV. But is this bad at all? To be honest, it was very hard for me to understand what I had to do to be sure
  that everything is repaired correctly. Recently I was told NOT to repair MV
  CFs but only to repair the base tables (see comment above from T Jake Luciani). This means one cannot just call
  "nodetool repair $keyspace" - this is complicated, not transparent and it
  sucks. I changed the behaviour in my own branch and ran the dtests for
  MVs. Two tests failed:
          • base_replica_repair_test of course fails due to the design change
• really_complex_repair_test fails because it intentionally times out
  the batch log. IMHO this is a bearable situation. It is comparable to
  resurrected tombstones when running a repair after GCGS has expired. You also
  would not expect this to be magically fixed. The GCGS default is 10 days and I
  can expect that anybody also repairs their MVs during that period, not only
  the base table. I'd suggest simply deleting these 2 tests. They prove nothing any more.

3. Rebuild + Decommission
Similar impacts as bootstrap + repair.

I rolled out these changes on our production cluster (including CASSANDRA-12905 + CASSANDRA-12984).
Before, I was not able to bootstrap a node with a load of roughly 280GB. Either it failed due to WTE (see 12905), or it completely flooded the logs with hint delivery failures (also fixed in 12905); and after having fixed that, the bootstrap didn't even finish within 24h, which is why I cancelled it.
After applying the mentioned changes, the bootstrap finished in under 5:30h. Repairs also seem to run quite smoothly so far, even though this does not fix CASSANDRA-12730, which is a different story.

Any thoughts on this?
Would anybody like to review it?

          brstgt Benjamin Roth added a comment -

          IMHO this is purely a matter of definition.

• No one says that a base table and its MVs have to be consistent with each other at all times.
• Cassandra generally promises eventual consistency, and that's how it should be with base tables and MVs.
• I MUST always repair my data before GCGS expires, so I have to repair base tables AND MVs. Whether I do it in one run or separately, the data will be consistent in the end.
• If you need absolutely consistent data, you need CL_QUORUM (R/W) or CL_ALL (R), no matter if you are querying a base table or an MV. And if you don't, it really does not matter whether it is your base table or your MV that is inconsistent.

To sum up:

• Treating them as regular tables solves a LOT of issues
• Increases transparency by applying the same principles to MVs and base tables
• Reduces special cases in code

I see more advantages than disadvantages.
          brstgt Benjamin Roth added a comment -

          Another fact of the day:

A usual repair (no outage or failures) of a KS with 2 tables and 1 MV was reduced from 11:15h to 6:15h by that patch.

          brstgt Benjamin Roth added a comment -

          Another example:

A repair of a KS with approx 1.7TB total on 8 nodes, with 7 base tables and 4 MVs, increased from roughly 18:30h to 23:30h. I would explain it like this:
This patch causes more validation work, as it now also validates MVs, not only base tables. Depending on the ratio of base tables to MVs and the extent of detected inconsistencies and repair streams, it is possible that this patch performs worse than before.
From my point of view it is still ok, because it will never perform worse than if all the MVs were normal tables. But if there are a lot of streams, e.g. due to a node failure recovery, this patch will perform much, much better than before.

          pauloricardomg Paulo Motta added a comment -

Thanks for your investigation Benjamin. As discussed on the mailing list and CASSANDRA-12905, we cannot ensure view consistency with sstable-based streaming of base/MVs in all scenarios, especially with repair, so I'm afraid this is not a viable solution just yet (it needs to be discussed further). An alternative is to provide an option for repair to allow repairing base and MV separately when you know what you are doing©.

While we work on improving MV streaming overhead in a separate ticket, I'd like to focus here on fixing the problem with MV/CDC repair stated in the description. What do you think of the previous proposal of just skipping the write path for base table mutations and keeping the sstables? While this is still a bit expensive, it will allow users to incrementally repair moderately-sized MVs, especially after CASSANDRA-12905.

          Your observations from large scale MV streaming in the context of bootstrap will be pretty useful, would you mind adding them to the follow-up streaming improvement ticket for MVs?

          By the way, congrats for the baby!

          brstgt Benjamin Roth added a comment -

Hi Paulo,

          Thanks for the congrats!

          About your proposal to skip the base table mutations:
I haven't analyzed it thoroughly (no time, you know), but my intuition says that there will be race conditions and possible inconsistencies if you "pick" the base table mutation out of the lock phase. I guess that to ensure base table <> view replica consistency you'd have to lock the whole CF while streaming a single SSTable, so that the MV mutations are processed serially and no other base table mutations slip in from the mutation stage and mess with the consistency.
As far as I can see, base table apply, base read and MV mutations MUST be serialized (actually, that's why there's a lock). Otherwise you will have stale MV rows again. This is why I think this proposal won't work. Or did I miss the point?

          CDC:
This case should be quite simple. I think you don't need the write path at all and just have to write the incoming mutations to the commit log in addition to streaming the sstable. In the worst case (server crash), commit log replay leads to redundant and unrepaired entries, but this should be a rare and recoverable situation.

          What do you think?

          vroldanbetan victor added a comment -

Hi guys, would you discourage using MVs until this is fixed?

          brstgt Benjamin Roth added a comment -

It depends; there are known issues, mostly related to repair and
streaming. MVs basically work and do what you expect of them, but
maintenance jobs may be slow and/or painful. So the good old saying is
true: you can use them if you understand them and know what you are doing.
But don't expect them to be plug-and-play.

          vroldanbetan victor added a comment -

          Hey Benjamin,

          thanks for your feedback. At least you refer to "slow" and "painful", but not "massive failure / data loss", right?

Still, the degree of complexity involved is not clear to me. Is it more complex than handling "manually denormalized tables"? How much more, and in which aspects, if I may ask? I'm not a Cassandra expert, so I can't say I have a deeper understanding of tables than of MVs. From what I interpret of your comment, MVs seem to be at least more complex to operate in a production cluster than normal tables. From a developer perspective, they are very appealing and remove some burden from the code.

          Are you aware of any description on how to deal with these "painful" scenarios?

          Thanks a lot!

          brstgt Benjamin Roth added a comment - - edited

          Hi Victor,

          We use MVs in Production with billions of records without known data loss.
          Painful + slow refers to repairs and range movements (e.g. bootstrap +
decommission). Also (as mentioned in this ticket) incremental repairs don't
          work, so full repair creates some overhead. Until 3.10 there are bugs
          leading to write timeouts, even to NPEs and completely blocked mutation
          stages. This could even bring your cluster down. In 3.10 some issues have
          been resolved - actually we use a patched trunk version which is 1-2 months
          old.

          Depending on your model, MVs can help a lot from a developer perspective.
          Some cases are very resource intensive to manage without MVs, requiring
          distributed locks and/or CAS.
For append-only workloads, it may be simpler NOT to use MVs at the moment.
Such workloads aren't very complex anyway, and MVs won't help that much compared to the
problems that may arise with them.

          Painful scenarios: There is no recipe for that. You may or may not
          encounter performance issues, depending on your model and your workload.
          I'd recommend not to use MVs that use a different partition key on the MV
          than on the base table as this requires inter-node communication for EVERY
          write operation. So you can easily kill your cluster with bulk operations
          (like in streaming).

          At the moment our cluster runs stable but it took months to find all the
          bottlenecks, race conditions, resume from failures and so on. So my
recommendation: You can get it to work, but you need time and you should not
          start with critical data, at least if it is not backed by another stable
          storage. And you should use 3.10 when it is finally released or build your
own version from trunk. I would not recommend using < 3.10 for MVs.

          Btw.: Our own patched version does some dirty tricks, that may lead to
          inconsistencies in some situations but we prefer some possible
          inconsistencies (we can deal with) over performance bottlenecks. I created
          several tickets to improve MV performance in some streaming situations but
          it will take some time to really improve that situation.

          Does this answer your question?

          vroldanbetan victor added a comment -

          Hi Benjamin,

thanks for the overall awesomeness! Your response is very helpful!

Our use-case exploits the atomicity of MVs in a non-append-only scenario. Aside from less code to write in our application, it allows us to skip multi-partition batches for achieving atomicity. It's still not 100% clear to me, but the cluster seems to be exposed to less stress on atomicity/denormalization with MVs than with multi-partition batches (at least DataStax indicates there are performance gains compared with a manual denormalization scenario, not even counting manual denormalization with batches; see http://www.datastax.com/dev/blog/materialized-view-performance-in-cassandra-3-x). Is this assumption correct?

          >I'd recommend not to use MVs that use a different partition key on the MV
          >than on the base table as this requires inter-node communication for EVERY
          >write operation. So you can easily kill your cluster with bulk operations
          >(like in streaming).

          Excuse my ignorance, but isn't having a different partition key the point of denormalization on MVs (to have different read paths)? Would this node coordination be worse or the same on a multi-partition batch scenario?

          Given our system stores critical information, we've decided to skip MVs altogether until the feature becomes more "ops friendly" on production.

          Thanks a lot!
          Víctor

          spodxx@gmail.com Stefan Podkowinski added a comment -

Can you please take this kind of discussion to user@cassandra.apache.org? This is an issue tracker, and you're email-spamming everyone who subscribed to the ticket to get updates on the actual ticket progress. Thank you.

          brstgt Benjamin Roth added a comment -

          Hi Victor,

          1. Performance:
Performance can be better with MVs than with batches, but this depends on the read performance of the base table vs. the performance overhead of batches, which in turn depends on the batch size and the batchlog performance. An MV always incurs a read before write, so MV performance depends heavily on how that read performs. The final write operation of the MV update is fast, as it works like a regular (local) write.

          2. Partition Keys and remote MV updates
You are of course right that this may be a common use case. You have to use it carefully. Maybe the situation has already improved with some bugfixes; the last time I tried was some months ago. To be fair, I have to mention that back then there was a bug with a race condition that could deadlock the whole mutation stage. With "remote MVs" we ran into this situation very frequently during bootstraps (for example). This has to do with MV locks and probably the much longer lock time when the MV update is remote, leading to more lock contention. With remote MV updates, the current write request also depends on the performance of remote nodes. This can lead to write timeouts much sooner, as long as the (remote) MV update is part of the write request and not deferred. So again: maybe this situation has improved in the meantime, but I personally didn't require it, so I was able to use normal tables to "twist" the PK. We currently use MVs only to add a field to the primary key for sorting.

          brstgt Benjamin Roth added a comment - - edited

          I am about to hack a proof of concept for that issue.

          Concept:
Each mutation and each partition update has a "repairedAt" flag. This will be passed along through the whole write path, like MV updates and serialization for remote MV updates. Repaired and non-repaired mutations then have to be separated in memtables and flushed to separate SSTables. From what I can see, it should be easier to maintain one memtable each for repaired and non-repaired data than to track the repair state within a single memtable.
Passing the repair state to replicas isn't even necessary, as replicas should not be repaired directly anyway, so there is no need for a repairedAt state there.
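A rough sketch of that routing, assuming a hypothetical repairedAt carried on the update and a hypothetical second memtable per table (none of these helpers exist in the current code):

// Hedged sketch only -- repairedAt(), getOrCreateRepairedMemtable(), getUnrepairedMemtable()
// and put() are placeholders for the concept described above.
void applyToMemtable(PartitionUpdate update, ColumnFamilyStore cfs)
{
    boolean repaired = repairedAt(update) != 0;         // assumption: repairedAt travels with the update
    Memtable target = repaired
                    ? getOrCreateRepairedMemtable(cfs)  // created on demand, only during repairs
                    : getUnrepairedMemtable(cfs);       // the "normal" memtable, as today
    put(target, update);

    // On flush, the repaired memtable produces an sstable whose metadata carries a
    // repairedAt timestamp, while the unrepaired memtable flushes with repairedAt = 0.
}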

          My question is:
How important is the exact value of "repairedAt"? Is it possible to merge updates with different repair timestamps into a single memtable and finally flush them to an SSTable with repairedAt set to the latest or earliest repairedAt timestamp of all mutations in the memtable?
Or would that produce repair inconsistencies or something?

          Any feedback?

          spodxx@gmail.com Stefan Podkowinski added a comment -

          Before moving on in the discussion, I think Paulo Motta's suggestion still stands and deserves some more thoughts:

          As discussed on the mailing list and CASSANDRA-12905 we cannot ensure view consistency with sstable-based streaming of base/MVs in all scenarios, specially with repair, so I'm afraid this is not a viable solution just yet (needs to be further discussed). An alternative is to provide an option for repair to allow repairing base and MV separately when you know what you are doing©.

Actually, having to repair both the base and the view table to get back into a consistent state is something I would expect as a user, and something that would happen anyway during my regular repairs, whereas the mutation-based approach is probably unheard of by most users and totally unexpected, given how we handled anti-entropy in the past. Although we talk about "views", I think most users will understand that these are actually separate tables and should be handled as such from an operational perspective. So do we really want to introduce additional complexity that may bite us in the back at some point, instead of just relying on regular repairs?

          tjake T Jake Luciani added a comment -

          Just to restate the reason we need to put MV base table mutations through the regular write path (vs just streaming the MV and the base) from CASSANDRA-6477

Let's say you put non-PK columns A and B into a materialized view. Replica 1 has all of column A, replica 2 has all of column B. The build would end up with no data in the MV. You would need to subsequently repair the data to build the MV.

          spodxx@gmail.com Stefan Podkowinski added a comment -

          If you build from an inconsistent node and some missing data never makes it into a view, regular repairs alone won't be able to fix that. Is that your point? Although I see that this could happen, I'd instead still prefer to just tell the user to do a full repair before creating a view, if that would be a big issue.

          brstgt Benjamin Roth added a comment -

A repair must go through the write path except for some special cases. I also first had the idea to avoid it completely, but in discussion with Paulo Motta it turned out that this may introduce inconsistencies that could only be fixed by a view rebuild, because it leaves stale rows.
I know that all this stuff is totally counter-intuitive, but just "blindly" streaming all sstables (incl. MV tables) down is not correct. This is why I am trying to improve the mutation-based approach.

If the sstables of an MV get corrupted or lost, the only way to fix it is to rebuild that view again. There is no way (at least none I see atm) to consistently repair a view from other nodes.

          The underlying principle is:

• A view must always be consistent with its base table
• A view does not have to be consistent among nodes; that's handled by repairing the base table

That's also why you don't have to run a repair before building a view. Nevertheless, it would not help anyway, because you NEVER have a 100% guaranteed consistent state. A repair only guarantees consistency up to the point of the repair.

          The "know what you are doing" option is offered by CASSANDRA-13066 btw.
          In this ticket I also adopted the election of CFs (tables + mvs) when doing a keyspace repair depending if the MV is repaired by stream or by mutation.

          brstgt Benjamin Roth added a comment -

For a detailed explanation, here is an excerpt from that discussion:


          ... there are still possible scenarios where it's possible to break consistency by repairing the base and the view separately even with QUORUM writes:

Initial state:

Base replica A: {k0=v0, ts=0}
Base replica B: {k0=v0, ts=0}
Base replica C: {k0=v0, ts=0}
View paired replica A: {v1=k0, ts=0}
View paired replica B: {v0=k0, ts=0}
View paired replica C: {v0=k0, ts=0}

Base replica A receives write {k1=v1, ts=1}, propagates to view paired replica A and dies.

Current state is:

Base replica A: {k1=v1, ts=1}
Base replica B: {k0=v0, ts=0}
Base replica C: {k0=v0, ts=0}
View paired replica A: {v1=k1, ts=1}
View paired replica B: {v0=k0, ts=0}
View paired replica C: {v0=k0, ts=0}

Base replica B and C receives write {k2=v2, ts=2}, write to their paired replica. Write is successful at QUORUM.

Current state is:

Base replica A: {k1=v1, ts=1}
Base replica B: {k2=v2, ts=2}
Base replica C: {k2=v2, ts=2}
View paired replica A: {v1=k1, ts=1}
View paired replica B: {v2=k2, ts=2}
View paired replica C: {v2=k2, ts=2}

A returns from the dead. Repair base table:

Base replica A: {k2=v2, ts=2}
Base replica B: {k2=v2, ts=2}
Base replica C: {k2=v2, ts=2}

Repair MV:

View paired replica A: {v1=k1, ts=1} and {v2=k2, ts=2}
View paired replica B: {v1=k1, ts=1} and {v2=k2, ts=2}
View paired replica C: {v1=k1, ts=1} and {v2=k2, ts=2}

So, this requires replica A to generate a tombstone for {v1=k1, ts=1} during repair of the base table.

          brstgt Benjamin Roth added a comment -

Maybe my former question got lost:

What effect does the repairedAt flag have on future repairs, other than that a non-zero value means the table has been repaired at some point?
          I am happy about any code references.

          spodxx@gmail.com Stefan Podkowinski added a comment -

The repairedAt value stored in each sstable's metadata indicates the time the sstable was repaired and nothing more. The basic idea behind tracking such a timestamp value is that once an sstable has been repaired, the contained data is consistent in a way that no node would miss any data such as tombstones, and therefore we won't have to repair this data ever again. This is what makes incremental repairs possible. As simple as the idea is, things start to become a bit tricky when we want to merge data, either by compaction or, in the case of this ticket, by applying mutations. The way compactions have been implemented is that we now have two pools of sstables that are compacted independently from each other: unrepaired and repaired data. Sstables within each pool can be compacted together just fine, and in the case of repaired data, the lowest timestamp of the compaction candidates is used for the output. However, the actual timestamp value currently doesn't really matter, as we just use it to track whether the sstable has been repaired or not. Future repairs may be executed based on unrepaired data only (incremental) or on both unrepaired and repaired data (full). Does this answer your question?
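To illustrate just the merge rule described above (assuming the convention that repairedAt == 0 means "unrepaired"), a compaction over sstables from the repaired pool could derive its output timestamp like this:

// Hedged sketch: the output repairedAt is the lowest repairedAt among the inputs;
// an empty input set is treated as unrepaired.
static long mergedRepairedAt(Collection<Long> inputRepairedAtValues)
{
    long min = Long.MAX_VALUE;
    for (long repairedAt : inputRepairedAtValues)
        min = Math.min(min, repairedAt);
    return min == Long.MAX_VALUE ? 0 : min;
}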

          brstgt Benjamin Roth added a comment -

Just perfect! That's EXACTLY what I wanted to know and it helps me to continue working on this ticket. I started some proof-of-concept work but it still needs some finalizing and exhaustive testing.

The concept is quite simple in theory (hopefully in reality, too):
Each table (CFS) may now have more than one active memtable, one each for unrepaired and repaired data (like the compaction pools). The repaired memtable does not have to be resident all the time, only during repairs, so my intention was to create it on demand and not to automatically re-create one after a flush. To keep things simple for a start, my intention was to apply flush behaviour to both memtables: either both are flushed or none is. Maybe this could be optimized in the future.

          tjake T Jake Luciani added a comment - - edited

          Proposing another (maybe simpler) solution to this problem:

          We currently replay the base table mutations through the write path and drop the streamed sstable.

          Instead, we could add the streamed base-table sstable, with its repairedAt flag, like any other repair.
          Then, just as we do now, replay the mutations through the write path, but with a mutation flag so that only the MVs are updated and not the base table.

          This means the MV wouldn't be incrementally repairable, but really you shouldn't need to repair the MVs unless there is data loss.
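
          A sketch of what such a flag-guarded replay could look like; the interface below is a hypothetical stand-in introduced for this example, not an existing Cassandra API:

              // hypothetical sketch: keep the streamed sstable (with its repairedAt)
              // and replay its data through a path that updates the views and the
              // commit log, but never re-writes the base table.
              interface ViewsOnlyUpdate
              {
                  void applyToViewsAndCommitLog();   // MV updates and CDC archiving
                  // deliberately no base-table write: the streamed sstable already carries that data
              }

              static void replayForViewsOnly(Iterable<ViewsOnlyUpdate> streamedUpdates)
              {
                  for (ViewsOnlyUpdate update : streamedUpdates)
                      update.applyToViewsAndCommitLog();
              }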

          Hide
          tjake T Jake Luciani added a comment -

          I think that idea won't fly, actually. The problem is that if you add the sstable first, the MV updates won't reflect the prior state. If you add it after the MV updates, there will be a window when the MV has data the base table does not. Maybe the latter isn't a deal breaker, but it leaves more room for problems.

          Hide
          brstgt Benjamin Roth added a comment -

          I also had this idea, but it won't work. It would totally break base <> MV consistency. Except: you could lock all involved partitions for the whole process. But that would create insanely long locks and extremely high contention.

          Hide
          brstgt Benjamin Roth added a comment -

          Btw.:
          My concept seems to work, but there is one question left:
          Why does a StreamSession create unrepaired SSTables?
          IncomingFileMessage
          => Creates RangeAwareSSTableWriter:97
          => cfs.createSSTableMultiWriter
          ...
          => CompactionStrategyManager.createSSTableMultiWriter:185

          Will it be marked as repaired later? If so, where/when?

          Why I ask:
          The received SSTable has the repairedFlag in RangeAwareSSTableWriter and its header, but it is lost when the SSTable is finished and returned as an SSTableReader.

          Hide
          brstgt Benjamin Roth added a comment -

          https://github.com/apache/cassandra/compare/trunk...Jaumo:CASSANDRA-12888
          Some dtest assertions: https://github.com/riptano/cassandra-dtest/compare/master...Jaumo:CASSANDRA-12888?expand=1
          Hide
          brstgt Benjamin Roth added a comment - - edited

          A review would be much appreciated. I don't know if Paulo Motta still wants to do the review. Please give me some feedback, thanks!

          Hide
          brstgt Benjamin Roth added a comment -

          Paulo Motta Have you been able to take a look at the patch yet? If not, maybe someone else wants to review it? It's been there for 2 months now.

          The patch introduces multiple (active) memtables per CF. This could also help in other situations like:
          https://issues.apache.org/jira/browse/CASSANDRA-13290
          https://issues.apache.org/jira/browse/CASSANDRA-12991

          Hide
          pauloricardomg Paulo Motta added a comment -

          Sorry for the delay here. The approach looks good, but the devil is in the details, so we need to be careful about introducing changes in the critical path.

          I will take a look this week.

          Hide
          brstgt Benjamin Roth added a comment -

          I am absolutely aware of that! That's why I also added some tests. All unit tests have run well so far. I also ran a bunch of probably related dtests, like the MV test suite, and those also looked good. Nevertheless, I don't want to rush you; take the time you need! I appreciate any feedback!

          Hide
          pauloricardomg Paulo Motta added a comment -

          First of all, thank you for your effort on this issue, and sorry for the delay. I had an initial look at the patch, and while I didn't spot any immediate flaws, it is not clear from the code alone whether this will have unintended repercussions in some other component (commit log archiving, memtable lifecycle/flushing, the read/write path), since it changes a core assumption: a given CF/table currently has a single active memtable (or at most two, during flush), whereas this introduces multiple active memtables. While in an ideal world correctness would be ensured by tests, we know well that our test coverage is far from ideal, so we need to make sure the approach is well discussed and all edge cases are understood before jumping into actual code.

          One way to facilitate communication on changes like this is a design doc. One is not generally required for small changes, but since we're modifying a core component it is important to spell out all the details, to make sure we're covering everything and to communicate the change to interested parties for feedback without exposing them to actual code. With that said, it would be great if you could prepare a short design doc with the proposed changes and their justification, how the memtable and flushing lifecycle is affected, the impacted areas, API changes/additions, and a test plan before we proceed.

          Overall I think you are on the right path but your approach needs some adjustments. See comments below:

          • I think explicitly managing the repaired memtable lifecycle from the repair job is less prone to errors/leaks than letting the memtable be created on demand when a mutation is applied and removed organically by automatic flush. My suggestion would be to create the memtable at the start of the repair job when necessary, append mutations to it, and then flush/remove the additional memtable at the end of the job.
          • In order for this to play along well with CASSANDRA-9143, the mutation would somehow need to include the repair session id, which would be used to retrieve the repairedAt from ActiveRepairService. In this arrangement we would keep a memtable per active repair session to ensure mutations from the same job are flushed to the same sstable. Tracker.getMemtableFor would fetch the right memtable for that repair session id, and return the ordinary unrepaired memtable if there is no memtable created for that repair session id (and probably log a warning); see the sketch after this list.
          • You modified flush to operate on all memtables of a given table, but we probably need to keep flush scoped to a single memtable, since we don't want to flush all memtables every time. For instance, if memtable_cleanup_threshold is reached, you will want to flush only the largest memtable, not all memtables of the table. Memtables will have different creation times, so you will only want to flush the expired memtable, not all memtables of the table. For repair you will only want to flush unrepaired memtables. Sometimes you will need to flush all memtables of a given table (such as for nodetool flush), but not every time. With this we keep the flush lifecycle pretty much as it is today: there will be at most 2 active memtables for unrepaired data and at most 2 active memtables for each repair session (one being flushed and maybe its replacement), which will facilitate understanding and reduce the surface for unintended side effects. All of these changes need to be spelled out in the design doc as well as javadoc to make sure they're clearly communicated.
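
          As a rough illustration of the per-session lookup suggested above (the names and signatures below are illustrative only, not the actual Tracker API):

              // illustrative sketch: one extra memtable per active repair session,
              // with plain writes falling back to the ordinary unrepaired memtable.
              import java.util.Map;
              import java.util.UUID;
              import java.util.concurrent.ConcurrentHashMap;

              class RepairAwareMemtables
              {
                  static class Memtable { /* stand-in for the real memtable */ }

                  private final Memtable unrepaired = new Memtable();
                  private final Map<UUID, Memtable> byRepairSession = new ConcurrentHashMap<>();

                  // created explicitly by the repair job when it starts
                  void registerRepairSession(UUID sessionId)
                  {
                      byRepairSession.put(sessionId, new Memtable());
                  }

                  // removed by the repair job at the end; the caller flushes it to a repaired sstable
                  Memtable finishRepairSession(UUID sessionId)
                  {
                      return byRepairSession.remove(sessionId);
                  }

                  // analogous to the suggested Tracker.getMemtableFor behaviour
                  Memtable memtableFor(UUID repairSessionId)
                  {
                      if (repairSessionId == null)
                          return unrepaired;
                      Memtable m = byRepairSession.get(repairSessionId);
                      if (m == null)
                      {
                          System.err.println("No memtable for repair session " + repairSessionId + ", using unrepaired");
                          return unrepaired;
                      }
                      return m;
                  }
              }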

          Marcus Eriksson, Blake Eggleston: does the above sound reasonable, or do you have any other suggestions or remarks? Marcus, is this more or less how you planned to add repaired memtables in CASSANDRA-8911, or did you have something else in mind?

          Hide
          pauloricardomg Paulo Motta added a comment -

          Another thing is that this will only be able to go into trunk/4.x, so we may need to think of a short-term workaround for the actual issue, such as disallowing incremental repairs for tables with MVs or CDC on 3.x, and add repaired memtables in a separate issue for 4.x.

          Hide
          bdeggleston Blake Eggleston added a comment -

          +1 on the design doc, or at least a comment thoroughly explaining the approach, since this seems to be a bit different from what was discussed above. Paulo, you're correct about #9143; we're going to need to preserve the pendingRepair value as well.

          At a high level though, I think the patch, and some of the solutions discussed, are approaching the problem from the wrong direction. Having to bend and repurpose storage system components like this to make streaming and repair work properly, to me, indicates there’s a problem in the MV implementation. Specifically, we shouldn't have to rewrite data that was just streamed. I think it would be better to focus on fixing those problems, instead of adding complexity to the storage layer to work around them. The contents of Mutations and UnfilteredRowIterators are pretty similar, decoupling the MV code from the write path should make it possible to apply the MV logic to sstable data without having to run it through the entire write path.
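
          A very rough sketch of that direction, using hypothetical stand-in interfaces rather than Cassandra's actual classes, just to show the shape of applying the MV logic to sstable data directly:

              // hypothetical sketch: feed streamed base-table partitions straight to
              // the (decoupled) view-update logic instead of converting them into
              // mutations and sending them through the entire write path.
              interface Partition { }                      // one base-table partition from the streamed sstable

              interface PartitionSource                    // e.g. a scanner over the streamed sstable
              {
                  Iterable<Partition> partitions();
              }

              interface ViewUpdateApplier                  // the MV logic, decoupled from the write path
              {
                  void apply(Partition basePartition);     // read pre-existing base state, write only to the views
              }

              static void applyViewUpdates(PartitionSource streamedSSTable, ViewUpdateApplier views)
              {
                  for (Partition p : streamedSSTable.partitions())
                      views.apply(p);
              }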

          Hide
          krummas Marcus Eriksson added a comment -

          I guess it looks pretty similar to what I wanted in CASSANDRA-8911

          I agree with Blake that this feels like the wrong approach, but I can't give a better suggestion since I have not touched the MV/CDC code much


            People

            • Assignee:
              brstgt Benjamin Roth
              Reporter:
              spodxx@gmail.com Stefan Podkowinski
              Reviewer:
              Paulo Motta
            • Votes:
              1
              Watchers:
              28
