Cassandra / CASSANDRA-3200

Repair: compare all trees together (for a given range/cf) instead of by pair in isolation

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 4.0-alpha1, 4.0
    • None

    Description

      Currently, repair compares Merkle trees pair by pair, each pair in isolation from any other tree. Concretely, if I have three nodes A, B and C (RF=3) with A and B in sync, but C having some range r inconsistent with both A and B (necessarily both, since A and B are consistent with each other), we will do the following transfers of r: A -> C, C -> A, B -> C, C -> B.
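
The pairwise behaviour described above can be sketched as follows. This is only an illustration, not Cassandra's repair code: the function name `pairwise_transfers` and the use of a single hash string per node (standing in for the Merkle tree of range r) are assumptions made for the example.

```python
from itertools import combinations


def pairwise_transfers(tree_hashes):
    """List (source, target) streams when every pair of replicas is
    compared in isolation.

    tree_hashes is a hypothetical mapping of node name to a digest that
    stands in for that node's Merkle tree over range r.
    """
    transfers = []
    for left, right in combinations(sorted(tree_hashes), 2):
        if tree_hashes[left] != tree_hashes[right]:
            # We cannot tell which side is newer, so stream both ways.
            transfers.append((left, right))
            transfers.append((right, left))
    return transfers


# A and B agree, C differs from both.
print(pairwise_transfers({"A": "h1", "B": "h1", "C": "h2"}))
# [('A', 'C'), ('C', 'A'), ('B', 'C'), ('C', 'B')]
```

Each pair is handled without knowledge of the others, which is why both A -> C and B -> C show up even though A and B hold the same data.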

      The fact that we do both A -> C and C -> A is fine, because we cannot know which of A or C is more up to date. However, the transfer B -> C is useless provided we do A -> C, since A and B are in sync. Not doing that transfer is a 25% improvement in that case (3 transfers instead of 4). With RF=5 and only one node inconsistent with all the others, that's almost a 40% improvement (5 transfers instead of 8), and so on.
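
To check the percentages, here is a small sketch that counts streams under both strategies. It assumes one plausible way of comparing all trees together: group the replicas by identical tree, and let each node receive the range from a single representative of every group that differs from its own. The names `pairwise_count`, `grouped_transfers` and `savings` are hypothetical, and this is not necessarily how the eventual fix is implemented.

```python
from collections import defaultdict


def pairwise_count(tree_hashes):
    """Number of streams when each differing pair streams in both directions."""
    nodes = sorted(tree_hashes)
    return sum(
        1
        for src in nodes
        for dst in nodes
        if src != dst and tree_hashes[src] != tree_hashes[dst]
    )


def grouped_transfers(tree_hashes):
    """Streams when all trees are compared together (assumed strategy):
    each node receives from one member of every group that differs from it."""
    groups = defaultdict(list)
    for node in sorted(tree_hashes):
        groups[tree_hashes[node]].append(node)
    transfers = []
    for target in sorted(tree_hashes):
        for digest, members in groups.items():
            if digest != tree_hashes[target]:
                # One representative is enough: the group is in sync.
                transfers.append((members[0], target))
    return transfers


def savings(tree_hashes):
    """Fraction of streams avoided compared with pairwise comparison."""
    before = pairwise_count(tree_hashes)
    after = len(grouped_transfers(tree_hashes))
    return 0.0 if before == 0 else 1 - after / before


rf3 = {"A": "h1", "B": "h1", "C": "h2"}
rf5 = {"A": "h1", "B": "h1", "C": "h1", "D": "h1", "E": "h2"}
print(savings(rf3))  # 0.25
print(savings(rf5))  # 0.375
```

With one out-of-sync node among RF replicas, the pairwise approach does 2(RF-1) streams and the grouped approach does RF, so the saving is (RF-2)/(2(RF-1)) and approaches 50% as RF grows.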

      Given that this situation of one node being out of sync while the others agree is probably fairly common (one node died, so it is behind), this could be a fair reduction in what is transferred. In the case where we use repair to completely rebuild a node, this will be a dramatic improvement, because it will keep the rebuilt node from receiving RF times the data it should get.

            People

              marcuse Marcus Eriksson
              slebresne Sylvain Lebresne
              Marcus Eriksson
              Blake Eggleston