[CASSANDRA-19364] Data loss during decommission possible due to a delayed and unsynced pending ranges calculation - ASF JIRA

Details

Type: Bug
Status: Triage Needed
Priority: Normal
Resolution: Unresolved
Fix Version/s: None
Component/s: Consistency/Bootstrap and Decommission
Labels: None
Platform: All
Impacts: None

Description

This possible issue was discovered while inspecting flaky tests for ~~CASSANDRA-18824~~. When a node is decommissioned, the pending ranges calculation (PRC) is executed asynchronously. If data is inserted during decommissioning and the pending ranges calculation is delayed for some reason (which can happen, because it is not synchronous), we may end up with partial data loss. It is also possible that the test itself is simply wrong, so I see this ticket more as a memo for further investigation or discussion.

Note that this has obviously been fixed by TCM (Transactional Cluster Metadata).

The test in question was:

        try (Cluster cluster = init(builder().withNodes(2)
                                             .withTokenSupplier(evenlyDistributedTokens(2))
                                             .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0"))
                                             .withConfig(config -> config.with(NETWORK, GOSSIP))
                                             .start(), 1))
        {
            IInvokableInstance nodeToDecommission = cluster.get(1);
            IInvokableInstance nodeToRemainInCluster = cluster.get(2);

            // Start decommission on nodeToDecommission
            cluster.forEach(statusToDecommission(nodeToDecommission));
            logger.info("Decommissioning node {}", nodeToDecommission.broadcastAddress());

            // Add data to cluster while node is decommissioning
            int numRows = 100;
            cluster.schemaChange("CREATE TABLE IF NOT EXISTS " + KEYSPACE + ".tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck))");
            insertData(cluster, 1, numRows, ConsistencyLevel.ONE); // <--- HERE: when PRC is delayed, only ~50% of the inserted rows are present afterwards

            // Check data before cleanup on nodeToRemainInCluster
            assertEquals(numRows, nodeToRemainInCluster.executeInternal("SELECT * FROM " + KEYSPACE + ".tbl").length);
        }
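
To make the suspected race easier to discuss, here is a minimal, self-contained toy model of it. It is not Cassandra code: the class `PendingRangesRaceSketch`, its fields, and the even/odd key ownership are all hypothetical and only illustrate the ordering problem. A single-threaded executor stands in for the asynchronous pending ranges calculation, and a latch stands in for "delayed for some reason". Writes to keys owned by the leaving node reach the remaining node only if the calculation has already been applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only; none of these names exist in Cassandra.
public class PendingRangesRaceSketch
{
    // Set to true once the simulated pending ranges calculation has been applied.
    private volatile boolean pendingRangesKnown = false;
    private final List<Integer> remainingNode = new ArrayList<>();
    private final List<Integer> leavingNode = new ArrayList<>();

    // Assumed ownership: even keys belong to the remaining node, odd keys to the leaving node.
    void write(int key)
    {
        if (key % 2 == 0)
        {
            remainingNode.add(key);
            return;
        }
        leavingNode.add(key);
        // The write is forwarded to the future owner only if the pending range is already known.
        if (pendingRangesKnown)
            remainingNode.add(key);
    }

    // Returns how many of 100 written rows end up on the remaining node.
    public static int rowsOnRemainingNode(boolean waitForCalculation)
    {
        PendingRangesRaceSketch sketch = new PendingRangesRaceSketch();
        ExecutorService calculator = Executors.newSingleThreadExecutor();
        CountDownLatch delay = new CountDownLatch(1);
        try
        {
            // The asynchronous calculation, held back by the latch.
            Future<?> calculation = calculator.submit(() -> {
                try
                {
                    delay.await();
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                    return;
                }
                sketch.pendingRangesKnown = true;
            });

            if (waitForCalculation)
            {
                // Synced variant: let the calculation finish before any write.
                delay.countDown();
                calculation.get();
            }

            for (int key = 0; key < 100; key++)
                sketch.write(key);

            // Delayed variant: the calculation completes only after all writes.
            delay.countDown();
            calculation.get();
            return sketch.remainingNode.size();
        }
        catch (InterruptedException | ExecutionException e)
        {
            throw new IllegalStateException(e);
        }
        finally
        {
            calculator.shutdown();
        }
    }

    public static void main(String[] args)
    {
        System.out.println("synced:  " + rowsOnRemainingNode(true));  // prints "synced:  100"
        System.out.println("delayed: " + rowsOnRemainingNode(false)); // prints "delayed: 50"
    }
}
```

In this model the delayed variant keeps only the rows the remaining node already owned, which matches the ~50% figure seen in the test with two evenly distributed tokens. Whether the real code path behaves the same way is exactly what this ticket leaves open.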

Issue Links

Discovered while testing: CASSANDRA-18824 Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica (Resolved)

People

Assignee: Unassigned

Reporter: Jacek Lewandowski

Votes: 0

Watchers: 2

Dates

Created: 05/Feb/24 10:59

Updated: 05/Feb/24 11:02