[CASSANDRA-10413] Replaying materialized view updates from commitlog after node decommission crashes Cassandra - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Urgent
Resolution: Fixed
Fix Version/s: 3.0.0 rc2
Component/s: Feature/Materialized Views, Legacy/Coordination
Labels: None
Severity: Critical

Description

This issue is reproducible through a Jepsen test, which can be run with:

lein with-profile +trunk test :only cassandra.mv-test/mv-crash-subset-decommission

This test crashes and restarts nodes while nodes are being decommissioned. The crashes and the decommissions are not coordinated with each other.

In CASSANDRA-10164, we introduced a change to re-apply materialized view updates on commitlog replay.

Upon restart, some nodes crash during commitlog replay: getViewNaturalEndpoint throws a runtime exception with the message "Trying to get the view natural endpoint on a non-data replica". To investigate, I added logging to getViewNaturalEndpoint that shows the results of replicationStrategy.getNaturalEndpoints for the baseToken and the viewToken.

The logs show that the crash occurs when baseEndpoints and viewEndpoints are identical but do not contain the broadcast address of the local node.

For example, a node at 10.0.0.5 crashes on replay of a write whose base token and view token replicas are both [10.0.0.2, 10.0.0.4, 10.0.0.6]. We seem to try to guard against this case by also considering pendingEndpoints for the viewToken, but that guard does not appear to be sufficient.
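The failure mode described above can be illustrated with a small Python model. This is only a sketch under assumptions, not Cassandra's actual Java code: the function name view_natural_endpoint, its parameters, and the pairing-by-index rule are hypothetical simplifications. Only the exception message and the example addresses come from this report.

```python
# Hypothetical, simplified model of pairing a base replica with a view replica.
# It is NOT the real getViewNaturalEndpoint; names and logic are assumptions.
MESSAGE = "Trying to get the view natural endpoint on a non-data replica"


def view_natural_endpoint(local, base_endpoints, view_endpoints,
                          pending_view_endpoints=()):
    # A node that is itself a view replica pairs with itself.
    if local in view_endpoints:
        return local
    # Endpoints shared by both lists pair with themselves, so drop them.
    shared = set(base_endpoints) & set(view_endpoints)
    base_only = [e for e in base_endpoints if e not in shared]
    view_only = [e for e in view_endpoints if e not in shared]
    # Pending view endpoints are appended as extra pairing candidates.
    view_only += [e for e in pending_view_endpoints if e not in view_only]
    # The local node must be a base replica, otherwise there is nothing to pair.
    if local not in base_only:
        raise RuntimeError(MESSAGE)
    return view_only[base_only.index(local)]


if __name__ == "__main__":
    replicas = ["10.0.0.2", "10.0.0.4", "10.0.0.6"]
    try:
        # The replaying node 10.0.0.5 is in neither replica list.
        view_natural_endpoint("10.0.0.5", replicas, replicas)
    except RuntimeError as exc:
        print("replay would fail:", exc)
```

In this model, pending endpoints only extend the view side, so they cannot help when the local node is missing from the base replicas, which may be why the existing guard falls short.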

I've attached the system.log files from a test run with the added logging. In the attached logs, n1 is at 10.0.0.2, n2 is at 10.0.0.3, and so on. The decommissioned node is n5, at 10.0.0.6.
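When reading the attached logs, a small helper can translate node names to addresses. This sketch assumes the "and so on" above means a linear mapping (nK is at 10.0.0.(K+1)), which matches the n1, n2, and n5 addresses given here; the function name node_address is hypothetical.

```python
import ipaddress


def node_address(name):
    """Map a node name such as 'n1' to its address, assuming nK -> 10.0.0.(K+1)."""
    index = int(name[1:])  # 'n5' -> 5
    # Adding an integer to an IPv4Address offsets it by that many addresses.
    return str(ipaddress.ip_address("10.0.0.1") + index)


if __name__ == "__main__":
    for node in ("n1", "n2", "n5"):
        print(node, node_address(node))
```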

Attachments

- n1.log (1.15 MB), 29/Sep/15 18:11, Joel Knighton
- n2.log (664 kB), 29/Sep/15 18:11, Joel Knighton
- n3.log (670 kB), 29/Sep/15 18:11, Joel Knighton
- n4.log (799 kB), 29/Sep/15 18:11, Joel Knighton
- n5.log (268 kB), 29/Sep/15 18:11, Joel Knighton

People

Assignee: Joel Knighton

Reporter: Joel Knighton

Authors: Joel Knighton

Reviewers: T Jake Luciani

Votes: 0

Watchers: 4

Dates

Created: 29/Sep/15 18:08

Updated: 16/Apr/19 09:30

Resolved: 07/Oct/15 19:48