Apache Cassandra / CASSANDRA-12281

Gossip blocks on startup when there are pending range movements

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.2.9, 3.0.11, 3.10
    • Component/s: Legacy/Core
    • Labels: None
    • Severity: Normal

    Description

      In our cluster, a normal node startup (after a drain on shutdown) takes less than 1 minute. However, when another node in the cluster is bootstrapping, the same startup takes nearly 30 minutes to complete, apparently because gossip blocks on pending range calculations.

      $ nodetool-a tpstats
      Pool Name                    Active   Pending      Completed   Blocked  All time blocked
      MutationStage                     0         0           1840         0                 0
      ReadStage                         0         0           2350         0                 0
      RequestResponseStage              0         0             53         0                 0
      ReadRepairStage                   0         0              1         0                 0
      CounterMutationStage              0         0              0         0                 0
      HintedHandoff                     0         0             44         0                 0
      MiscStage                         0         0              0         0                 0
      CompactionExecutor                3         3            395         0                 0
      MemtableReclaimMemory             0         0             30         0                 0
      PendingRangeCalculator            1         2             29         0                 0
      GossipStage                       1      5602            164         0                 0
      MigrationStage                    0         0              0         0                 0
      MemtablePostFlush                 0         0            111         0                 0
      ValidationExecutor                0         0              0         0                 0
      Sampler                           0         0              0         0                 0
      MemtableFlushWriter               0         0             30         0                 0
      InternalResponseStage             0         0              0         0                 0
      AntiEntropyStage                  0         0              0         0                 0
      CacheCleanupExecutor              0         0              0         0                 0
      
      Message type           Dropped
      READ                         0
      RANGE_SLICE                  0
      _TRACE                       0
      MUTATION                     0
      COUNTER_MUTATION             0
      REQUEST_RESPONSE             0
      PAGED_RANGE                  0
      READ_REPAIR                  0
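
The telling rows in the output above are GossipStage (5602 pending) and PendingRangeCalculator (1 active, 2 pending). As a rough aid for spotting this kind of backlog, the following sketch scans tpstats-style text for pools whose Pending column exceeds a threshold. The class name, method name, and threshold are hypothetical, and the parser assumes the six-column pool layout shown in this report; it is not part of Cassandra or nodetool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TpstatsBacklog {

    /**
     * Returns pool name -> pending count for every pool row whose
     * Pending column is greater than pendingThreshold.
     */
    static Map<String, Long> backloggedPools(String tpstatsOutput, long pendingThreshold) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : tpstatsOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Pool rows have six columns:
            // name, active, pending, completed, blocked, all-time blocked.
            // The header row and the dropped-message rows have other widths.
            if (cols.length != 6) {
                continue;
            }
            try {
                long pending = Long.parseLong(cols[2]);
                if (pending > pendingThreshold) {
                    result.put(cols[0], pending);
                }
            } catch (NumberFormatException e) {
                // Not a numeric pool row; ignore it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample =
            "Pool Name                    Active   Pending      Completed   Blocked  All time blocked\n" +
            "MutationStage                     0         0           1840         0                 0\n" +
            "PendingRangeCalculator            1         2             29         0                 0\n" +
            "GossipStage                       1      5602            164         0                 0\n";
        // Threshold of 1000 is an arbitrary illustrative value.
        System.out.println(backloggedPools(sample, 1000));
    }
}
```

With the sample rows taken from this report and a threshold of 1000, only GossipStage is flagged.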
      

      A full thread dump is attached, but the relevant bit seems to be here:

      [ ... ]
      
      "GossipStage:1" #1801 daemon prio=5 os_prio=0 tid=0x00007fe4cd54b000 nid=0xea9 waiting on condition [0x00007fddcf883000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00000004c1e922c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
      	at org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:174)
      	at org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:160)
      	at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2023)
      	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1682)
      	at org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1182)
      	at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1165)
      	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1128)
      	at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:58)
      	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      
      [ ... ]
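
The stack above shows GossipStage parked in WriteLock.lock() on a fair ReentrantReadWriteLock inside TokenMetadata.updateNormalTokens. A minimal sketch of that blocking pattern follows. It assumes, as the description suggests but this excerpt does not show, that another thread holds the read lock of the same lock for the length of a slow calculation. The class and method names are hypothetical and the sleep merely stands in for that work; this is not Cassandra code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FairLockStall {

    /**
     * Measures how long (in ms) a writer waits for the write lock while
     * another thread holds the read lock for readerHoldMillis.
     */
    static long writerWaitMillis(long readerHoldMillis) throws InterruptedException {
        // Fair mode, matching the FairSync seen in the thread dump.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        CountDownLatch readerHasLock = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHasLock.countDown();
                // Stand-in for a long computation done under the read lock (assumption).
                Thread.sleep(readerHoldMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "slow-reader");
        reader.start();
        readerHasLock.await(); // make sure the read lock is held first

        long start = System.nanoTime();
        lock.writeLock().lock(); // parks here until the reader releases
        try {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            lock.writeLock().unlock();
            reader.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("writer waited about " + writerWaitMillis(200) + " ms");
    }
}
```

Because the writer cannot proceed until every read lock is released, a single-threaded stage that needs the write lock makes no progress in the meantime, and its queue keeps growing, which is consistent with the 5602 pending GossipStage tasks reported above.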
      

      Attachments

        1. 12281-2.2.patch
          23 kB
          Stefan Podkowinski
        2. 12281-3.0.patch
          23 kB
          Stefan Podkowinski
        3. 12281-3.X.patch
          23 kB
          Stefan Podkowinski
        4. 12281-trunk.patch
          23 kB
          Stefan Podkowinski
        5. restbase1015-a_jstack.txt
          368 kB
          Eric Evans

            People

              Assignee: Stefan Podkowinski (spod)
              Reporter: Eric Evans (urandom)
              Authors: Stefan Podkowinski
              Reviewers: Joel Knighton
              Votes: 0
              Watchers: 13
