CASSANDRA-14297

Startup checker should wait for count rather than percentage


Details

    • Priority: Low

    Description

      As I commented in CASSANDRA-13993, the current wait-for functionality is a great step in the right direction, but I don't think that the current setting (70% of nodes in the cluster) is the right configuration option. First, 70% does not actually protect against errors: if you wait for only 70% of the cluster you can still very easily hit UnavailableException or ReadTimeoutException, because even two nodes down in different racks make these exceptions possible (and with the default num_tokens setting of 256 they are basically guaranteed; see the estimate below). Second, this option is not easy for operators to set; the only value I could think of that would "just work" is 100%.
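      To make "basically guaranteed" concrete, here is a back-of-envelope estimate (my illustration, not code from the ticket). It assumes RF=3 and that each token range's replica set is roughly a random 3-subset of the nodes, which is approximately what vnodes give you; under that assumption, a given range has lost QUORUM with probability (n-2)/C(n,3) = 6/(n*(n-1)) when two specific nodes are down:

      // Back-of-envelope estimate, not Cassandra code: with two nodes down and
      // ~num_tokens ranges per node, the chance that at least one range has
      // lost two of its three replicas is effectively 1 at any realistic size.
      public class QuorumLossEstimate
      {
          public static void main(String[] args)
          {
              int numTokens = 256; // the default num_tokens
              for (int n : new int[]{ 6, 12, 24, 96 })
              {
                  double pRange = 6.0 / (n * (double) (n - 1));
                  long ranges = (long) numTokens * n; // ~one range per vnode token
                  double pAnyLoss = 1.0 - Math.pow(1.0 - pRange, ranges);
                  System.out.printf("%3d nodes: P(some range below QUORUM) ~= %.4f%n", n, pAnyLoss);
              }
          }
      }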

      I proposed in that ticket that instead of having `block_for_peers_percentage` default to 70%, we have `block_for_peers` as a count of the nodes that are allowed to be down before the starting node makes itself available as a coordinator. Of course, we would still have the timeout to limit startup time and deal with really extreme situations (whole datacenters down, etc.).
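      As a rough sketch of the mechanics (hypothetical names, not the actual patch), the gate would just be a count-plus-timeout loop during startup:

      import java.time.Duration;
      import java.util.function.Supplier;

      // Hypothetical sketch, not the actual patch: block startup until at most
      // `blockForPeers` peers are down, or until `timeout` elapses so that
      // startup can never hang forever.
      public final class PeerStartupGate
      {
          public static void await(Supplier<Integer> downPeerCount, int blockForPeers, Duration timeout) throws InterruptedException
          {
              long deadlineNanos = System.nanoTime() + timeout.toNanos();
              while (downPeerCount.get() > blockForPeers && System.nanoTime() < deadlineNanos)
                  Thread.sleep(1000); // poll gossip-derived liveness once a second
          }
      }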

      I started working on a patch for this change on GitHub, and I am happy to finish it up with unit tests and such if someone can review/commit it (maybe aweisberg?).

      I think the short version of my proposal is that we replace:

      block_for_peers_percentage: <percentage needed up, defaults to 70%>
      

      with either

      block_for_peers: <number that can be down, defaults to 1>
      

      or, if we want to do even better imo and enable advanced operators to fine-tune this behavior (while still having good defaults that work for almost everyone):

      block_for_peers_local_dc: <number that can be down, defaults to 1>
      block_for_peers_each_dc: <number that can be down, defaults to sys.maxint>
      block_for_peers_all_dcs: <number that can be down, defaults to sys.maxint>
      

      For example, if an operator knows that they must be available at LOCAL_QUORUM they would set block_for_peers_local_dc=1; if they use EACH_QUORUM they would set block_for_peers_each_dc=1; and if they use QUORUM (RF=3, dcs=2) they would set block_for_peers_all_dcs=2 (see the fragment below). Naturally, everything would still have a timeout to prevent startup from taking too long.
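      For instance, a hypothetical cassandra.yaml fragment for the QUORUM (RF=3, dcs=2) case might look like this (setting names per the proposal above; the values are illustrative):

      # QUORUM over 2 DCs at RF=3 means 6 replicas total and a quorum of 4,
      # so at most 2 replicas may be down cluster-wide.
      block_for_peers_local_dc: 1            # proposal default
      block_for_peers_each_dc: 2147483647    # "unbounded" (sys.maxint in the proposal)
      block_for_peers_all_dcs: 2             # wait until at most 2 peers are down anywhere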


People

    Assignee: Joey Lynch
    Reporter: Joey Lynch
    Authors: Joey Lynch
    Reviewers: Ariel Weisberg


Time Tracking

    Original Estimate: Not Specified
    Remaining Estimate: 0h
    Time Spent: 7h