Cassandra / CASSANDRA-16364

Joining nodes simultaneously with auto_bootstrap:false can cause token collision

    Details

    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: 4.0.x
    • Component/s: Cluster/Membership
    • Labels:
      None
    • Bug Category:
      Correctness - Consistency
    • Severity:
      Normal
    • Complexity:
      Normal
    • Discovered By:
      User Report
    • Platform:
      All
    • Impacts:
      None

      Description

      While bringing up a 6-node ccm cluster to test 4.0-beta4, two nodes chose the same tokens using the default allocate_tokens_for_local_rf. Despite the collision, both nodes completed bootstrap successfully.

      We were already familiar with this issue from CASSANDRA-13701 and CASSANDRA-16079; the workaround is to avoid parallel bootstrap when using allocate_tokens_for_local_rf.

      However, because this is the default behavior, we should try to detect and prevent this situation where possible, as it can break users who rely on parallel bootstrap.

      I think we could prevent this as follows:
      1. announce intent to bootstrap via gossip (i.e. add the node to gossip without token information)
      2. wait for gossip to settle for a longer period (i.e. ring delay)
      3. allocate tokens (if multiple bootstrap attempts are detected, break ties via node ID)
      4. broadcast tokens and move on with bootstrap
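
      The tie-break in step 3 could look roughly like the sketch below. It is a toy model, not Cassandra code: the unsigned 2**64 token space, the "bisect the largest gap" allocator, the string node IDs, and the function names allocate_token and allocate_with_tie_break are all assumptions made for illustration. The point it shows is that once every joiner sees the same settled set of pending joiners and processes them in node-ID order, each one computes the same assignment and no two joiners pick the same token.

```python
RING_SIZE = 2**64  # simplified unsigned token space (assumption, not the Murmur3 range)


def allocate_token(ring_tokens):
    """Toy deterministic allocator: bisect the largest gap on the ring.

    This stands in for the real token allocation algorithm, which is
    more involved; only the determinism matters for the sketch.
    """
    if not ring_tokens:
        return 0
    tokens = sorted(ring_tokens)
    best_start, best_gap = None, -1
    for i, start in enumerate(tokens):
        end = tokens[(i + 1) % len(tokens)]
        # A single token owns the whole ring, so a zero gap means RING_SIZE.
        gap = (end - start) % RING_SIZE or RING_SIZE
        if gap > best_gap:  # strict comparison keeps the choice deterministic
            best_start, best_gap = start, gap
    return (best_start + best_gap // 2) % RING_SIZE


def allocate_with_tie_break(existing_tokens, pending_node_ids):
    """Assign one token per pending joiner, processing joiners in node-ID order.

    Each joiner can run this locally on the settled gossip state and read
    off its own entry; lower node IDs are accounted for first.
    """
    ring = set(existing_tokens)
    assignment = {}
    for node_id in sorted(pending_node_ids):
        token = allocate_token(ring)
        ring.add(token)  # later joiners see earlier joiners' tokens
        assignment[node_id] = token
    return assignment


if __name__ == "__main__":
    existing = {0, 2**63}
    # Without coordination, both joiners see the same ring and collide.
    print(allocate_token(existing) == allocate_token(existing))
    # With the node-ID tie-break, the tokens are distinct.
    print(allocate_with_tie_break(existing, ["node-b", "node-a"]))
```

      The sketch only covers the allocation step; it assumes gossip has already settled, so that all joiners agree on the set of pending node IDs. If two joiners saw different pending sets, they could still diverge, which is why step 2 waits before allocating.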

            People

            • Assignee:
              Unassigned
            • Reporter:
              Paulo Motta (paulo)
            • Reviewers:
              Michael Semb Wever
