  2. CASSANDRA-13348

Duplicate tokens after bootstrap

Details

    • Bug
    • Status: Resolved
    • Urgent
    • Resolution: Cannot Reproduce
    • 3.0.18
    • None
    • None
    • Critical

    Description

      This one is a bit scary, and probably results in data loss. After a bootstrap of a few new nodes into an existing cluster, two new nodes have chosen some overlapping tokens.

      In fact, of the 256 tokens chosen, 51 tokens were already in use on the other node.

      Node 1 log:

      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for ring information
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for schema information to complete
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: waiting for pending range calculation
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: getting bootstrap token
      WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 TokenAllocation.java:61 - Selected tokens [............, 2959334889475814712, 3727103702384420083, 7183119311535804926, 6013900799616279548, -1222135324851761575, 1645259890258332163, -1213352346686661387, 7604192574911909354]
      WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.00 min 1.00 stddev 0.0000
      WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.0000
      WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:70 - Unexpected growth in standard deviation after allocation.
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
      INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 StorageService.java:1160 - JOINING: Starting to bootstrap...
      

      Node 2 log:

      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 StorageService.java:971 - Joining ring by operator request
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for ring information
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for schema information to complete
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for pending range calculation
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: getting bootstrap token
      WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 TokenAllocation.java:61 - Selected tokens [......, 2890709530010722764, -2416006722819773829, -5820248611267569511, -5990139574852472056, 1645259890258332163, 9135021011763659240, -5451286144622276797, 7604192574911909354]
      WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.02 min 0.98 stddev 0.0000
      WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.0000
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
      INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 StorageService.java:1160 - JOINING: Starting to bootstrap...
      

      For example, 7604192574911909354 was chosen by both nodes (as was 1645259890258332163).

      The joins were eight days apart, so I don't think it's a race.
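
      The overlap can be checked mechanically from the two logs. The following is a minimal sketch, assuming each node logs its allocation on a single "Selected tokens [...]" line as in the excerpts above; the function names parse_selected_tokens and duplicate_tokens are hypothetical helpers written for this illustration, not part of Cassandra. The sample strings contain only the tokens visible in the excerpts, with the elided entries left elided.

```python
import re

# Matches the bracketed token list of a "Selected tokens [...]" log line.
TOKEN_LINE = re.compile(r"Selected tokens \[(.*?)\]")


def parse_selected_tokens(log_text):
    """Collect the integer tokens from every 'Selected tokens' line in log_text."""
    tokens = set()
    for match in TOKEN_LINE.finditer(log_text):
        for item in match.group(1).split(","):
            item = item.strip()
            # Skip elided placeholders such as "............".
            if re.fullmatch(r"-?\d+", item):
                tokens.add(int(item))
    return tokens


def duplicate_tokens(log_a, log_b):
    """Return the sorted tokens that appear in both logs."""
    return sorted(parse_selected_tokens(log_a) & parse_selected_tokens(log_b))


# Visible portions of the two log lines quoted in this report.
node1_log = (
    "Selected tokens [............, 2959334889475814712, 3727103702384420083, "
    "7183119311535804926, 6013900799616279548, -1222135324851761575, "
    "1645259890258332163, -1213352346686661387, 7604192574911909354]"
)
node2_log = (
    "Selected tokens [......, 2890709530010722764, -2416006722819773829, "
    "-5820248611267569511, -5990139574852472056, 1645259890258332163, "
    "9135021011763659240, -5451286144622276797, 7604192574911909354]"
)

print(duplicate_tokens(node1_log, node2_log))
# prints [1645259890258332163, 7604192574911909354]
```

      Run against the full, unelided log lines, the same comparison would be expected to list all 51 shared tokens.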


            People

              dikanggu Dikang Gu
              tvdw Tom van der Woerdt
              Dikang Gu
