CASSANDRA-16364

Joining nodes simultaneously with auto_bootstrap:false can cause token collision

Details

    • Bug
    • Status: Open
    • Normal
    • Resolution: Unresolved
    • 4.0.x
    • Cluster/Membership
    • None
    • Correctness - Consistency
    • Normal
    • Normal
    • User Report
    • All
    • None

    Description

      While bringing up a 6-node ccm cluster to test 4.0-beta4, two nodes chose the same tokens using the default allocate_tokens_for_local_rf setting. Nevertheless, both completed bootstrap successfully with colliding tokens.
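
To make the failure concrete, a collision can be checked for by comparing the token sets of all nodes. The following is a minimal sketch, not Cassandra code: the `find_token_collisions` helper, the node names, and the token values are hypothetical, and the input is assumed to be a mapping from node name to its tokens (for example, gathered by hand from `nodetool ring` output).

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_token_collisions(tokens_by_node: Dict[str, Iterable[int]]) -> Dict[int, List[str]]:
    """Return every token claimed by more than one node, mapped to its claimants."""
    owners = defaultdict(set)
    for node, tokens in tokens_by_node.items():
        for token in tokens:
            owners[token].add(node)
    # Keep only tokens with two or more distinct owners.
    return {token: sorted(nodes) for token, nodes in owners.items() if len(nodes) > 1}


# Hypothetical ring: node4 and node5 picked identical tokens.
ring = {
    "node3": [-100, 250],
    "node4": [400, 900],
    "node5": [400, 900],
}
collisions = find_token_collisions(ring)
```

An empty result means no two nodes share a token; any entry names the nodes that would need to be re-bootstrapped.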

      We were already familiar with this issue from CASSANDRA-13701 and CASSANDRA-16079; the workaround is to avoid parallel bootstrap when using allocate_tokens_for_local_rf.

      However, since this is the default behavior, we should try to detect and prevent this situation where possible, because it can break users who rely on parallel bootstrap.

      I think we could prevent this as follows:
      1. announce intent to bootstrap via gossip (i.e. add the node to gossip without token information)
      2. wait for gossip to settle for a longer period (i.e. ring delay)
      3. allocate tokens (if multiple bootstrap attempts are detected, tie-break via node-id)
      4. broadcast tokens and move on with bootstrap
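
The four steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not Cassandra's implementation: `GossipState`, `may_allocate`, `allocate_tokens`, and `bootstrap_all` are hypothetical names, gossip is modelled as a single shared dictionary, the ring-delay wait of step 2 is not modelled, and "tie-break via node-id" is interpreted here as "the pending joiner with the lowest host ID allocates first while the others wait". The allocator is a placeholder that takes the first free candidate tokens; it does not reproduce the allocate_tokens_for_local_rf algorithm.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GossipState:
    """Toy stand-in for gossip state (hypothetical, not Cassandra's Gossiper)."""
    # host ID -> tokens; None means "intent announced, no tokens yet".
    endpoints: Dict[uuid.UUID, Optional[List[int]]] = field(default_factory=dict)

    def announce_intent(self, host_id: uuid.UUID) -> None:
        self.endpoints[host_id] = None  # step 1: join gossip without tokens

    def pending_joiners(self) -> List[uuid.UUID]:
        return sorted(h for h, t in self.endpoints.items() if t is None)

    def taken_tokens(self) -> set:
        return {t for tokens in self.endpoints.values() if tokens for t in tokens}

    def broadcast_tokens(self, host_id: uuid.UUID, tokens: List[int]) -> None:
        self.endpoints[host_id] = tokens  # step 4: publish the chosen tokens


def may_allocate(gossip: GossipState, host_id: uuid.UUID) -> bool:
    """Step 3 tie-break (assumed policy): lowest pending host ID goes first."""
    pending = gossip.pending_joiners()
    return bool(pending) and pending[0] == host_id


def allocate_tokens(gossip: GossipState, candidates: List[int], num_tokens: int) -> List[int]:
    """Placeholder allocator: first free candidates not yet seen in gossip."""
    taken = gossip.taken_tokens()
    return [t for t in candidates if t not in taken][:num_tokens]


def bootstrap_all(gossip: GossipState, host_ids: List[uuid.UUID],
                  candidates: List[int], num_tokens: int) -> None:
    for host_id in host_ids:
        gossip.announce_intent(host_id)
    # Step 2 (waiting out the ring delay) is skipped: state is shared here.
    while gossip.pending_joiners():
        for host_id in host_ids:
            if may_allocate(gossip, host_id):
                tokens = allocate_tokens(gossip, candidates, num_tokens)
                gossip.broadcast_tokens(host_id, tokens)


gossip = GossipState()
node_a, node_b = uuid.UUID(int=1), uuid.UUID(int=2)
# node_b "starts" first, but node_a has the lower host ID and allocates first.
bootstrap_all(gossip, [node_b, node_a], candidates=list(range(0, 100, 10)), num_tokens=4)
```

Because each joiner allocates only after every lower-ID joiner has broadcast its tokens, the second allocation sees the first node's tokens and cannot pick the same ones. The cost of this policy is that parallel bootstraps are effectively serialized at the token-allocation step.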

          People

            Unassigned
            Paulo Motta
            Michael Semb Wever
