CASSANDRA-8942 (parent issue: CASSANDRA-8494 incremental bootstrap)

Keep node up even when bootstrap fails (and provide a tool to resume bootstrap)

    Details

      Description

      With CASSANDRA-8838, we can keep a bootstrapping node up when some streaming fails, provided we offer a tool to resume the failed bootstrap streaming.

      A node whose bootstrap failed enters a mode similar to 'write survey mode': other nodes in the cluster still see it as bootstrapping, but they continue to send writes to it.

      By providing a new nodetool command that resumes bootstrap from the saved bootstrap state, we can continue bootstrapping after resolving the issue that caused the previous failure.
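
      As an illustration of how an operator might script the resume step, here is a minimal sketch that calls nodetool bootstrap resume (the command named later in this ticket) and retries a few times. The function name, the retry count, the delay, and the assumption that a non-zero exit status means the resume attempt failed are all hypothetical and are not taken from the patch.

```python
import subprocess
import time


def resume_bootstrap(run=subprocess.run, max_attempts=3, delay_seconds=60.0,
                     sleep=time.sleep):
    """Invoke `nodetool bootstrap resume` until it succeeds or attempts run out.

    Assumption (not from the ticket): a non-zero exit status means the
    resume attempt failed and may be retried after the underlying
    streaming problem has been addressed.

    Returns the number of attempts used on success, or None if every
    attempt failed.
    """
    command = ["nodetool", "bootstrap", "resume"]
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        # Wait before retrying, but not after the final attempt.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return None
```

      The run and sleep parameters are injectable only so the sketch can be exercised without a live cluster; in real use the defaults would run nodetool on the node whose bootstrap failed.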

          Activity

          yukim Yuki Morishita added a comment -

          Pushed patch based on CASSANDRA-8838: https://github.com/yukim/cassandra/commits/8942

          This keeps the node up after a bootstrap failure, and the user can run nodetool bootstrap resume to resume bootstrapping.

          I also created a dtest here: https://github.com/yukim/cassandra-dtest/tree/CASSANDRA-8942

          jbellis Jonathan Ellis added a comment -

          Sam Tunnicliffe to review

          yukim Yuki Morishita added a comment -

          Updated both branches for patch and dtest (URL above).
          Ready to be reviewed, thanks!

          beobal Sam Tunnicliffe added a comment -

          +1 both LGTM

          trivial nit: should use a wildcard import for o.a.c.streaming in Bootstrapper

          yukim Yuki Morishita added a comment -

          Thanks for the review.
          Committed with the nit fix, and added a brief description to NEWS.txt.

          The dtest pull request is here: https://github.com/riptano/cassandra-dtest/pull/206

          pauloricardomg Paulo Motta added a comment -

          Yuki Morishita, what happens if I call nodetool bootstrap resume on a node where a bootstrap is still running? Is the previous bootstrap cancelled, or do the two run simultaneously?

          It would be nice to give users a way to stop or restart hung bootstraps, if nodetool bootstrap resume does not already provide this.

          jeromatron Jeremy Hanna added a comment -

          Is there a reason why this could not be done also with nodetool rebuild, to be able to resume that as well?

          yukim Yuki Morishita added a comment -

          I think we can. Would you mind creating a ticket?

          jeromatron Jeremy Hanna added a comment -

          Thanks - CASSANDRA-10810


            People

            • Assignee: Yuki Morishita (yukim)
            • Reporter: Yuki Morishita (yukim)
            • Reviewer: Sam Tunnicliffe
            • Votes: 0
            • Watchers: 7
