Apache Cassandra / CASSANDRA-3912

repair user provided custom token range (support incremental repair controlled by external agent)

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 1.1.1
    • None

    Description

      As a poor man's precursor to CASSANDRA-2699, exposing the ability to repair small parts of a range is extremely useful because it allows a node's content to be repaired slowly over time (with external scripting logic). Besides avoiding the bulkiness of complete repairs, it means that you can safely run repairs even if you absolutely cannot afford, e.g., disk space spikes (see CASSANDRA-2699 for what the issues are).

      Attaching a patch that exposes a "repairincremental" command to nodetool, where you specify a step and the number of total steps. Incrementally performing a repair in 100 steps, for example, would be done by:

      nodetool repairincremental 0 100
      nodetool repairincremental 1 100
      ...
      nodetool repairincremental 99 100
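
The "step N out of M" form above can be illustrated with a small sketch of how one step might map to a token sub-range. This is not the code from the attached patch: `step_subrange` and `RING_SIZE` are hypothetical names, and the sketch assumes RandomPartitioner tokens can be treated as integers modulo 2**127, which is a simplification.

```python
# Hypothetical sketch, not the patch itself.
# Assumption: RandomPartitioner tokens are treated as integers modulo 2**127.
RING_SIZE = 2**127


def step_subrange(start, end, step, total):
    """Return the (left, right] token sub-range for `step` out of `total`.

    `start` and `end` bound a range (start, end] owned by the node. The
    range may wrap around the end of the ring (start >= end).
    """
    if total <= 0 or not 0 <= step < total:
        raise ValueError("need total > 0 and 0 <= step < total")
    # Width of the range, accounting for wrap-around past the ring maximum.
    width = (end - start) % RING_SIZE
    if width == 0:
        width = RING_SIZE  # start == end is taken to mean the full ring
    # Integer arithmetic keeps adjacent steps contiguous with no gaps.
    left = (start + width * step // total) % RING_SIZE
    right = (start + width * (step + 1) // total) % RING_SIZE
    return left, right
```

Because the right edge of step N is computed with the same expression as the left edge of step N+1, running all M steps covers the whole range exactly once under these assumptions.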
      

      An external script can be used to keep track of what has been repaired and when. This should (1) allow incremental repair to happen now/soon, and (2) allow experimentation and evaluation for an implementation of CASSANDRA-2699, which I still think is a good idea. This patch does nothing to help the average deployment, but it at least makes incremental repair possible given sufficient effort spent on external scripting.
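
A minimal sketch of such an external script follows, assuming a JSON state file that records when each step was last repaired. The function names, the state file layout, and the one-week default for `max_age_seconds` are all assumptions made for illustration. The `nodetool repairincremental` invocation is the form proposed in this description and may not match what was eventually committed.

```python
import json
import subprocess
import time
from pathlib import Path


def load_state(state_path):
    """Load the {"step/total": last_repair_timestamp} mapping, if present."""
    p = Path(state_path)
    return json.loads(p.read_text()) if p.exists() else {}


def run_nodetool_step(step, total):
    # Command form taken from the ticket description; assumed, not verified
    # against any released nodetool.
    subprocess.run(
        ["nodetool", "repairincremental", str(step), str(total)], check=True
    )


def repair_pending_steps(total, state_path, run_step=run_nodetool_step,
                         max_age_seconds=7 * 24 * 3600, now=time.time):
    """Repair every step whose last recorded repair is missing or too old.

    Returns the list of steps that were repaired in this invocation.
    """
    state = load_state(state_path)
    repaired = []
    for step in range(total):
        key = f"{step}/{total}"
        last = state.get(key)
        if last is not None and now() - last < max_age_seconds:
            continue  # repaired recently enough, skip
        run_step(step, total)
        state[key] = now()
        # Persist after each step so an interrupted run can resume.
        Path(state_path).write_text(json.dumps(state))
        repaired.append(step)
    return repaired
```

Calling `repair_pending_steps(100, "repair_state.json")` from a scheduler would then work through the 100 steps, skipping any that were repaired within the chosen window.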

      The big "no-no" about the patch is that it is entirely specific to RandomPartitioner and BigIntegerToken. If someone can suggest a way to implement this command generically using the Range/Token abstractions, I'd be happy to hear it.

      An alternative would be to provide a nodetool command that lets you specify the exact token ranges on the command line. That makes it a bit more difficult to use, but it would work for any partitioner and token type.

      Unless someone can suggest a better way to do this, I think I'll provide a patch that does this. I'm still leaning towards supporting the simple "step N out of M" form though.

      Attachments

        1. CASSANDRA-3912-v2-002-fix-antientropyservice.txt
          6 kB
          Peter Schuller
        2. CASSANDRA-3912-v2-001-add-nodetool-commands.txt
          9 kB
          Peter Schuller
        3. CASSANDRA-3912-trunk-v1.txt
          10 kB
          Peter Schuller
        4. 3912_v2.txt
          7 kB
          Sylvain Lebresne

            People

              Assignee: Peter Schuller (scode)
              Reporter: Peter Schuller (scode)
              Peter Schuller
              Stu Hood
              Votes: 0
              Watchers: 6
