[CASSANDRA-11748] Schema version mismatch may lead to Cassandra OOM at bootstrap during a rolling upgrade process - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Urgent
Resolution: Unresolved
Fix Version/s: 3.0.x, 3.11.x, 5.x
Component/s: Legacy/Core
Labels:
None
Environment:

Rolling upgrade process from 1.2.19 to 2.0.17.
CentOS 6.6
Occurred on different C* nodes in deployments of different scale (2G ~ 5G)

Severity:
Critical
Since Version:

2.0.17

Description

We have observed multiple times that a multi-node C* (v2.0.17) cluster ran into OOM at bootstrap during a rolling upgrade from 1.2.19 to 2.0.17.

Here is a simple outline of our rolling upgrade process and what we observe:
1. Update the schema on a node, and wait until all nodes are in schema version agreement, checked via nodetool describecluster.
2. Restart a Cassandra node.
3. After the restart, there is a chance that the restarted node has a different schema version.
4. All nodes in the cluster start to rapidly exchange schema information, and any of the nodes could run into OOM.
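
The agreement check in step 1 could be scripted roughly as follows. This is a minimal sketch, assuming the "Schema versions:" section of the nodetool describecluster output lists one "<uuid>: [endpoints]" line per version. The function names and the SAMPLE text are illustrative assumptions, and entries such as UNREACHABLE are ignored by this sketch.

```python
import re

# Matches lines such as "<uuid>: [192.168.88.33, 192.168.88.34]".
VERSION_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}):\s*\[(.*)\]\s*$")

# Illustrative output only (assumed layout); UUIDs and addresses reuse the
# values that appear in the logs of this report.
SAMPLE = """Cluster Information:
\tName: Test Cluster
\tSchema versions:
\t\t4cb463f8-5376-3baf-8e88-a5cc6a94f58f: [192.168.88.33]
\t\tf5270873-ba1f-39c7-ab2e-a86db868b09b: [192.168.88.34]
"""


def schema_versions(describecluster_output):
    """Map each schema version UUID to the list of endpoints reporting it."""
    versions = {}
    in_section = False
    for line in describecluster_output.splitlines():
        if line.strip().startswith("Schema versions:"):
            in_section = True
            continue
        if not in_section:
            continue
        match = VERSION_LINE.match(line)
        if match:
            endpoints = [e.strip() for e in match.group(2).split(",") if e.strip()]
            versions[match.group(1)] = endpoints
    return versions


def in_agreement(describecluster_output):
    """True only when exactly one schema version is reported."""
    return len(schema_versions(describecluster_output)) == 1
```

Fed the text printed by nodetool describecluster, in_agreement returns False for as long as more than one version is listed, which is the state described in step 3.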

The following system.log excerpts are from one of our 2-node test clusters.
----------------------------------
Before rebooting node 2:
Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 MigrationManager.java (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f

Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 MigrationManager.java (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f

After rebooting node 2,
Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b

Node 2 keeps submitting migration tasks to the other node, more than 100 times:
INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node /192.168.88.33 has restarted, now UP
INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) Updating topology for /192.168.88.33
...
DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
... (more than 100 times)
----------------------------------
On the other hand, Node 1 keeps updating its gossip information, then receives and submits migration tasks:

INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP
...
DEBUG [MigrationStage:1] 2016-04-19 11:18:18,496 MigrationRequestVerbHandler.java (line 41) Received migration request from /192.168.88.34.
... (more than 100 times)
DEBUG [OptionalTasks:1] 2016-04-19 11:19:18,337 MigrationManager.java (line 127) submitting migration task for /192.168.88.34
... (more than 50 times)
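
To quantify the repetition in logs like the ones above, the migration messages can be tallied per endpoint. The following is a minimal sketch, assuming DEBUG logging is enabled and the message texts match the excerpts in this report. The function name is hypothetical and only IPv4 addresses are handled.

```python
import re
from collections import Counter

# Message prefixes as they appear in the excerpts; matched case-insensitively
# because the log shows both "Submitting" and "submitting".
MIGRATION_EVENT = re.compile(
    r"(submitting migration task for|received migration request from)"
    r"\s+(/\d+\.\d+\.\d+\.\d+)",
    re.IGNORECASE,
)


def count_migration_events(log_lines):
    """Count migration messages, keyed by (lowercased message, endpoint)."""
    counts = Counter()
    for line in log_lines:
        match = MIGRATION_EVENT.search(line)
        if match:
            counts[(match.group(1).lower(), match.group(2))] += 1
    return counts
```

Passing an open system.log file object (or any iterable of lines) returns a Counter, so most_common() shows which peer is targeted most often.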

As a side note, we have more than 200 column families defined in the Cassandra database, which may be related to this volume of RPC traffic.

P.S. The excess schema migration requests eventually result in InternalResponseStage performing schema merge operations. Each merge requires a compaction, so the responses are consumed much more slowly than they arrive. The backlog of incoming schema migration payloads therefore consumes all of the heap space and ultimately ends in OOM.
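
The backlog described above can be illustrated with a toy queue model. This is only a sketch under stated assumptions: a single consumer, fixed arrival and merge rates per time step, and hypothetical numbers. It is not a measurement from our cluster and not Cassandra code.

```python
def pending_responses(arrivals_per_tick, merges_per_tick, ticks):
    """Return the queue depth after each tick.

    arrivals_per_tick: schema responses enqueued per tick (hypothetical rate)
    merges_per_tick:   schema merges the consumer completes per tick
    """
    depth = 0
    history = []
    for _ in range(ticks):
        depth += arrivals_per_tick
        depth -= min(depth, merges_per_tick)  # cannot drain more than is queued
        history.append(depth)
    return history


def estimated_backlog_bytes(depth, payload_bytes):
    """Rough heap held by queued payloads, assuming equal-sized payloads."""
    return depth * payload_bytes
```

When arrivals outpace merges, the depth grows linearly with time. For example, pending_responses(10, 1, 5) gives [9, 18, 27, 36, 45], and each queued entry pins a full schema payload on the heap until it is merged.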

Issue Links

is duplicated by

CASSANDRA-14840 Bootstrap of new node fails with OOM in a large cluster

Open

relates to

CASSANDRA-15158 Wait for schema agreement rather than in flight schema requests when bootstrapping

Resolved

CASSANDRA-13569 Schedule schema pulls just once per endpoint

Open

People

Assignee: Unassigned

Reporter: Michael Fong

Reviewers: Aleksey Yeschenko

Votes: 0

Watchers: 15

Dates

Created: 11/May/16 03:27

Updated: 07/Mar/23 10:54