Cassandra / CASSANDRA-2606

Expose through JMX the ability to repair only the primary range

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.0.0
    • Component/s: Core
    • Labels:

      Description

      CASSANDRA-2324 introduces the ability to repair only a given range. This ticket proposes adding a nodetool repairPrimaryRange command that triggers a repair of only the primary range of a node. This makes it possible to repair a full cluster without any duplicated work (or at least makes it much simpler). It also introduces a global_repair command in clustertool that triggers the primary range repair on each node of the cluster, one after another (we could run them all in parallel, but that's probably not nice on the cluster).

          Activity

          Sylvain Lebresne added a comment -

          Patch attached. I'm targeting 0.8 because this is really about exposing what's already in the code, and the chance to screw up existing code is nonexistent.

          Jonathan Ellis added a comment - edited

          I think I'd rather have it be an option to repair (--primary-range-only) than a new command.

          -1 on doing anything more with clustertool than putting it out of its misery (CASSANDRA-2607).

          How does "primary range repair" work with NTS, where ranges are unique to each DC?

          Sylvain Lebresne added a comment -

          > I think I'd rather have it be an option to repair (--primary-range-only) than a new command.

          > -1 on doing anything more with clustertool than putting it out of its misery (CASSANDRA-2607).

          Fair enough, v2 does both of those things.

          > How does "primary range repair" work with NTS, where ranges are unique to each DC?

          Not sure what you mean by "ranges are unique to each DC". Even with NTS, the
          primary ranges are still computed over the one full ring (that rules them all).
          So picking the primary range still has the property we are interested in here,
          that is "if you repair the primary range of all the nodes of the full
          cluster, then you will have repaired the full ring without doing any work
          twice".
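The property described above can be sketched in a few lines. This is a toy model, not Cassandra code: it assumes a small integer ring of size 100, four hypothetical nodes A to D, and a simple placement where each node also stores the primary ranges of the nodes that precede it on the ring. The function names are made up for illustration.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # (left, right], possibly wrapping around the ring


def primary_ranges(tokens: Dict[str, int]) -> Dict[str, Range]:
    """Map each node to (previous token on the ring, its own token]."""
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    result = {}
    for i, (node, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # i == 0 wraps around to the last node
        result[node] = (prev_token, token)
    return result


def range_size(r: Range, ring_size: int) -> int:
    """Number of tokens in (left, right], treating left == right as the full ring."""
    left, right = r
    return (right - left) % ring_size or ring_size


def replicated_ranges(tokens: Dict[str, int], rf: int) -> Dict[str, List[Range]]:
    """Assumed simple placement: a node holds its own primary range plus the
    primary ranges of the rf - 1 nodes that precede it on the ring."""
    ordered = sorted(tokens, key=tokens.get)
    prim = primary_ranges(tokens)
    n = len(ordered)
    return {
        node: [prim[ordered[(i - k) % n]] for k in range(min(rf, n))]
        for i, node in enumerate(ordered)
    }


if __name__ == "__main__":
    ring_size = 100  # hypothetical tiny ring
    tokens = {"A": 0, "B": 25, "C": 50, "D": 75}

    prim = primary_ranges(tokens)
    primary_work = sum(range_size(r, ring_size) for r in prim.values())

    full = replicated_ranges(tokens, rf=3)
    full_work = sum(range_size(r, ring_size) for rs in full.values() for r in rs)

    # Primary-range repair touches every token once; repairing every range a
    # node holds, on every node, touches every token rf times.
    print("primary-only work:", primary_work)
    print("full repair work:", full_work)
```

Under these assumptions, the primary ranges partition the ring, so their sizes add up to exactly the ring size, while repairing every replicated range on every node adds up to the replication factor times the ring size.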

          Now it is true that since with NTS you compute your token assignment
          separately in each DC, you may end up with some nodes having a tiny primary
          range (for instance if you have the same number of nodes in each DC and only
          offset the tokens by 1 across DCs). In which case the repair on those nodes
          will be very quick. But I don't think this is a problem in any way (every host
          will still do roughly the same amount of work overall).
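To make the "offset by 1" case concrete, here is a small sketch under assumed numbers: a hypothetical ring of size 300, two DCs with three nodes each, tokens balanced inside each DC, and the second DC shifted by one token. The node names and the helper function are illustrative only.

```python
from typing import Dict

RING_SIZE = 300  # hypothetical tiny ring so the numbers are easy to read

# Assumed layout: balanced tokens inside each DC, DC2 offset by 1 from DC1.
TOKENS: Dict[str, int] = {
    "dc1-a": 0, "dc1-b": 100, "dc1-c": 200,
    "dc2-a": 1, "dc2-b": 101, "dc2-c": 201,
}


def primary_range_sizes(tokens: Dict[str, int], ring_size: int) -> Dict[str, int]:
    """Size of each node's primary range, computed over the one full ring."""
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    sizes = {}
    for i, (node, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # i == 0 wraps around to the last node
        sizes[node] = (token - prev_token) % ring_size or ring_size
    return sizes


if __name__ == "__main__":
    for node, size in sorted(primary_range_sizes(TOKENS, RING_SIZE).items()):
        print(node, size)
```

With this layout the DC2 nodes each end up with a primary range of a single token and the DC1 nodes with 99 tokens each. The sizes still sum to the full ring, which is the only property the primary-range repair relies on.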

          However, maybe what you meant is that "repair is not optimized for multi-DC
          settings". Which is a very good remark but is really a whole new problem. I'll
          open a ticket for that.

          Jonathan Ellis added a comment -

          +1 v2

          Sylvain Lebresne added a comment -

          Wait a minute ... that doesn't really work.

          Or more precisely, repairing a given range on a node N only makes sure that N has this range up to date. The other replicas for this range may not be fully up to date on that range. This means that running repair on the primary range of every node won't result in a fully repaired ring.

          Actually I think the right fix would be to make it so that repairing a range really repairs this range on all the replicas for this range, independently of which node we connected to in order to start the repair. I've created CASSANDRA-2610 for that. So sorry about that, but I think we should wait on that last ticket before committing this.
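The difference between the two behaviours can be illustrated with a toy model. This is not how Cassandra repair is implemented: the cluster layout, the key sets, and every function name below are hypothetical, and "repair" is reduced to merging sets of keys. It only shows why a repair that updates the initiating node alone leaves the ring inconsistent, while a repair that updates every replica of the range does not.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy cluster: three ranges, each stored on two replicas.
# The first node listed for a range is treated as its primary owner.
REPLICAS: Dict[str, List[str]] = {
    "r1": ["A", "B"],
    "r2": ["B", "C"],
    "r3": ["C", "A"],
}

Data = Dict[Tuple[str, str], Set[str]]  # (node, range) -> keys held


def make_data() -> Data:
    """Starting state: replicas of r1 and r2 disagree, r3 is already in sync."""
    return {
        ("A", "r1"): {"k1"},
        ("B", "r1"): {"k2"},
        ("B", "r2"): {"k3"},
        ("C", "r2"): {"k3", "k4"},
        ("C", "r3"): {"k5"},
        ("A", "r3"): {"k5"},
    }


def repair_initiator_only(data: Data, node: str, rng: str) -> None:
    """Only the initiating node ends up with the merged data."""
    merged = set().union(*(data[(n, rng)] for n in REPLICAS[rng]))
    data[(node, rng)] = merged


def repair_all_replicas(data: Data, rng: str) -> None:
    """Every replica of the range ends up with the merged data."""
    merged = set().union(*(data[(n, rng)] for n in REPLICAS[rng]))
    for n in REPLICAS[rng]:
        data[(n, rng)] = set(merged)


def repair_primary_everywhere(data: Data, all_replicas: bool) -> Data:
    """Run a primary-range repair once from each range's primary owner."""
    for rng, nodes in REPLICAS.items():
        if all_replicas:
            repair_all_replicas(data, rng)
        else:
            repair_initiator_only(data, nodes[0], rng)
    return data


def in_sync(data: Data) -> bool:
    """True when all replicas of every range hold the same keys."""
    return all(
        len({frozenset(data[(n, rng)]) for n in nodes}) == 1
        for rng, nodes in REPLICAS.items()
    )


if __name__ == "__main__":
    print(in_sync(repair_primary_everywhere(make_data(), all_replicas=False)))
    print(in_sync(repair_primary_everywhere(make_data(), all_replicas=True)))
```

In the first run, node B still lacks k1 for r1 after every primary range has been repaired, so the ring is not in sync. In the second run, which models the behaviour proposed for CASSANDRA-2610, one pass over the primary ranges is enough.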

          Sylvain Lebresne added a comment -

          Rebased patch now that CASSANDRA-2610 has been committed.

          Jonathan Ellis added a comment -

          +1

          nit: should we call it the "partitioner range" or something else besides "primary," which has connotations that we'd prefer to avoid?

          Sylvain Lebresne added a comment -

          > nit: should we call it the "partitioner range" or something else besides "primary," which has connotations that we'd prefer to avoid?

          Agreed, I was thinking the same thing. Not sure 'partitioner range' is really meaningful but I agree that it'll avoid confusion. I was also thinking of 'local range' as an alternative, but it's not really more meaningful, just shorter.

          Sylvain Lebresne added a comment -

          Committed with the 'partitioner range' fix

          Hudson added a comment -

          Integrated in Cassandra #1069 (See https://builds.apache.org/job/Cassandra/1069/)
          expose ability to only repair the primary range of a node
          patch by slebresne; reviewed by jbellis for CASSANDRA-2606

          slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164602
          Files :

          • /cassandra/trunk/CHANGES.txt
          • /cassandra/trunk/NEWS.txt
          • /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
          • /cassandra/trunk/src/java/org/apache/cassandra/service/StorageServiceMBean.java
          • /cassandra/trunk/src/java/org/apache/cassandra/tools/NodeCmd.java
          • /cassandra/trunk/src/java/org/apache/cassandra/tools/NodeProbe.java

            People

            • Assignee: Sylvain Lebresne
            • Reporter: Sylvain Lebresne
            • Reviewer: Jonathan Ellis

            Time Tracking

            • Estimated: 2h
            • Remaining: 2h
            • Logged: Not Specified