Cassandra / CASSANDRA-11933

Cache local ranges when calculating repair neighbors

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.1.15, 2.2.7, 3.0.8, 3.8
    • Component/s: Legacy/Core
    • Labels: None

      Description

      During a full repair on a cluster of roughly 60 nodes, I observed that this stage can take a significant share of the total repair time (up to 60 percent):

      https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997

      It's simply caused by the fact that https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189 calls

      ss.getLocalRanges(keyspaceName)

      every time, and that call accounts for more than 99% of the time spent in this stage. The call takes 600 ms when there is no load on the cluster and longer when there is. So for 10k ranges, it takes at least 1.5 hours just to compute the ranges.
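
As a rough check of that estimate, assuming exactly one 600 ms call per range and 10,000 ranges (both figures taken from the description above, with no additional overhead counted):

```latex
\[
  10\,000 \ \text{ranges} \times 0.6 \ \text{s/range}
  = 6\,000 \ \text{s}
  = 100 \ \text{min}
  \approx 1.67 \ \text{h}
\]
```

This is consistent with the "at least 1.5 hours" figure, and it is a lower bound because the call is slower when the cluster is under load.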

      Underneath, it calls ReplicationStrategy.getAddressRanges, which can get pretty inefficient (Jonathan Ellis's words).

      ss.getLocalRanges(keyspaceName) should be cached to avoid spending hours on it.
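
The idea can be illustrated with a minimal, self-contained sketch. It is not the committed patch: `TokenRange`, `CachedLocalRanges`, and `countLocallyOwned` are hypothetical names invented for this example, and the loader lambda merely stands in for an expensive call such as `ss.getLocalRanges(keyspaceName)`. The point is only that the expensive lookup runs once per keyspace rather than once per range being repaired.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LocalRangeCacheSketch {

    /** Hypothetical stand-in for a token range; not Cassandra's Range<Token>. */
    static final class TokenRange {
        final long start;
        final long end;

        TokenRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        /** True if 'other' lies entirely inside this (non-wrapping) range. */
        boolean contains(TokenRange other) {
            return start <= other.start && other.end <= end;
        }
    }

    /** Memoizes an expensive per-keyspace lookup (assumed to be the slow call). */
    static final class CachedLocalRanges {
        private final Function<String, List<TokenRange>> loader;
        private final Map<String, List<TokenRange>> cache = new ConcurrentHashMap<>();

        CachedLocalRanges(Function<String, List<TokenRange>> loader) {
            this.loader = loader;
        }

        /** Runs the loader only on the first request for a given keyspace. */
        List<TokenRange> get(String keyspace) {
            return cache.computeIfAbsent(keyspace, loader);
        }

        /** Drops a cached entry, e.g. after ring or replication changes. */
        void invalidate(String keyspace) {
            cache.remove(keyspace);
        }
    }

    /** Counts how many of the requested ranges fall inside some local range. */
    static int countLocallyOwned(CachedLocalRanges localRanges, String keyspace,
                                 List<TokenRange> toRepair) {
        int owned = 0;
        for (TokenRange range : toRepair) {
            // Served from the cache after the first iteration.
            for (TokenRange local : localRanges.get(keyspace)) {
                if (local.contains(range)) {
                    owned++;
                    break;
                }
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        AtomicInteger loaderCalls = new AtomicInteger();
        CachedLocalRanges cache = new CachedLocalRanges(ks -> {
            loaderCalls.incrementAndGet(); // counts the simulated expensive calls
            return Arrays.asList(new TokenRange(0, 100), new TokenRange(200, 300));
        });
        List<TokenRange> toRepair = Arrays.asList(
                new TokenRange(10, 20), new TokenRange(150, 160), new TokenRange(250, 260));
        int owned = countLocallyOwned(cache, "ks1", toRepair);
        // prints "owned=2, loader calls=1"
        System.out.println("owned=" + owned + ", loader calls=" + loaderCalls.get());
    }
}
```

A cache like this is only valid while the ring topology and the keyspace's replication settings stay unchanged, so a real implementation would need to invalidate it on such changes, or simply compute the local ranges once at the start of each repair and pass them along.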

              People

              • Assignee: Mahdi Mohammadinasab (mahdix)
              • Reporter: Cyril Scetbon (cscetbon)
              • Authors: Mahdi Mohammadinasab
              • Reviewers: Paulo Motta
              • Votes: 1
              • Watchers: 14
