[CASSANDRA-11933] Cache local ranges when calculating repair neighbors - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 2.1.15, 2.2.7, 3.0.8, 3.8
Component/s: Legacy/Core
Labels: None

Description

During a full repair on a ~60-node cluster, I saw that this stage can be significant (up to 60 percent of the total time):

https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997

It's mainly caused by the fact that https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189 calls

ss.getLocalRanges(keyspaceName)

every time, and that call accounts for more than 99% of the time spent there. It takes 600 ms when there is no load on the cluster and more when there is. So for 10k ranges, computing the ranges alone takes at least 1.5 hours (10,000 x 600 ms = 6,000 s, about 100 minutes).

Underneath, it calls ReplicationStrategy.getAddressRanges, which can get pretty inefficient (jbellis's words).

ss.getLocalRanges(keyspaceName) should be cached to avoid having to spend hours on it.
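The caching idea can be sketched as below. This is only a minimal illustration, not the committed patch: the class LocalRangeCache, its loader function, and the use of plain strings for ranges are all hypothetical stand-ins, with the loader playing the role of the expensive ss.getLocalRanges(keyspaceName) call. It assumes the cache lives only for the duration of one repair command, so it does not have to deal with ranges changing when the ring topology changes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: remember the result of an expensive per-keyspace
 * lookup so that it runs once per keyspace instead of once per range.
 * Not thread-safe; intended to be created and used by a single repair command.
 */
class LocalRangeCache {
    // Stand-in for the expensive call (ss.getLocalRanges in the description).
    private final Function<String, List<String>> loader;
    // Keyspace name -> local ranges already computed.
    private final Map<String, List<String>> cache = new HashMap<>();
    // Number of times the expensive loader actually ran.
    private int loads = 0;

    LocalRangeCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    /** Returns the cached ranges, invoking the loader only on the first request for a keyspace. */
    List<String> getLocalRanges(String keyspace) {
        List<String> ranges = cache.get(keyspace);
        if (ranges == null) {
            ranges = loader.apply(keyspace);
            loads++;
            cache.put(keyspace, ranges);
        }
        return ranges;
    }

    int loadCount() {
        return loads;
    }
}
```

With a cache like this, a loop over 10k ranges of the same keyspace would pay the 600 ms cost once rather than 10k times. The cache should be discarded when the repair command finishes so that a later repair sees any topology change.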

Issue Links

relates to

CASSANDRA-3912 repair user provided custom token range (support incremental repair controlled by external agent)

Resolved

People

Assignee: Mahdi Mohammadinasab

Reporter: Cyril Scetbon

Authors: Mahdi Mohammadinasab

Reviewers: Paulo Motta

Votes: 1

Watchers: 14

Dates

Created: 31/May/16 23:08

Updated: 16/Apr/19 09:30

Resolved: 15/Jun/16 09:50