CASSANDRA-9258

Range movement causes CPU & performance impact

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 2.2.5, 3.0.3, 3.3
    • None
    • None
    • Cassandra 2.1.4

    • Normal

    Description

      We are observing large CPU and latency regressions during range movements on clusters with many tens of thousands of vnodes. CPU usage increases by ~80% while a single node is being replaced.

      Top methods are:

      1) Ljava/math/BigInteger;.compareTo in Lorg/apache/cassandra/dht/ComparableObjectToken;.compareTo
      2) Lcom/google/common/collect/AbstractMapBasedMultimap;.wrapCollection in Lcom/google/common/collect/AbstractMapBasedMultimap$AsMap$AsMapIterator;.next
      3) Lorg/apache/cassandra/db/DecoratedKey;.compareTo in Lorg/apache/cassandra/dht/Range;.contains

      Here's a sample stack from a thread dump:

      "Thrift:50673" daemon prio=10 tid=0x00007f2f20164800 nid=0x3a04af runnable [0x00007f2d878d0000]
         java.lang.Thread.State: RUNNABLE
            at org.apache.cassandra.dht.Range.isWrapAround(Range.java:260)
            at org.apache.cassandra.dht.Range.contains(Range.java:51)
            at org.apache.cassandra.dht.Range.contains(Range.java:110)
            at org.apache.cassandra.locator.TokenMetadata.pendingEndpointsFor(TokenMetadata.java:916)
            at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:775)
            at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:541)
            at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:616)
            at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1101)
            at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1083)
            at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:976)
            at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3996)
            at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3980)
            at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
            at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
            at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:745)
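
      The stack above shows the write path calling TokenMetadata.pendingEndpointsFor, which ends up in Range.contains. The following is a minimal sketch of why that gets expensive when there are many pending ranges, and of one possible way to narrow the search. It is not the code from the attached patches. The class names (PendingRangeSketch, Range, Pending, Index), the use of long tokens, and the indexing strategy are all assumptions made for illustration; the only behaviour taken from Cassandra is that a range is (left, right] and may wrap around the ring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class PendingRangeSketch {
    /** Hypothetical token range (left, right]; wraps around the ring when left >= right. */
    static final class Range {
        final long left, right;
        Range(long left, long right) { this.left = left; this.right = right; }
        boolean isWrapAround() { return left >= right; }
        boolean contains(long token) {
            return isWrapAround() ? (token > left || token <= right)
                                  : (token > left && token <= right);
        }
    }

    /** Hypothetical pairing of a pending range with the endpoint that will own it. */
    static final class Pending {
        final Range range;
        final String endpoint;
        Pending(Range range, String endpoint) { this.range = range; this.endpoint = endpoint; }
    }

    /** Linear scan: every write tests the token against every pending range. */
    static List<String> linearLookup(List<Pending> pending, long token) {
        List<String> out = new ArrayList<>();
        for (Pending p : pending)
            if (p.range.contains(token)) out.add(p.endpoint);
        return out;
    }

    /** Assumed alternative: non-wrapping ranges are indexed by their left bound. */
    static final class Index {
        private final TreeMap<Long, List<Pending>> byLeft = new TreeMap<>();
        private final List<Pending> wrapping = new ArrayList<>();

        Index(List<Pending> pending) {
            for (Pending p : pending) {
                if (p.range.isWrapAround()) wrapping.add(p);
                else byLeft.computeIfAbsent(p.range.left, k -> new ArrayList<>()).add(p);
            }
        }

        List<String> lookup(long token) {
            List<String> out = new ArrayList<>();
            // Only ranges whose left bound is strictly below the token can contain it.
            for (List<Pending> bucket : byLeft.headMap(token, false).values())
                for (Pending p : bucket)
                    if (token <= p.range.right) out.add(p.endpoint);
            // Wrap-around ranges are few, so they are checked directly.
            for (Pending p : wrapping)
                if (p.range.contains(token)) out.add(p.endpoint);
            return out;
        }
    }
}
```

      With tens of thousands of vnodes, linearLookup performs that many Range.contains calls per write, which is consistent with the hot methods listed above. The Index variant skips ranges that start at or after the token, but it can still visit many candidates when ranges overlap heavily; an interval tree would give a tighter bound.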

      Attachments

        1. 0001-pending-ranges-maps-for-2.2.patch
          24 kB
          Dikang Gu
        2. 0001-pending-ranges-map.patch
          22 kB
          Dikang Gu
        3. Screenshot 2015-12-16 16.11.51.png
          543 kB
          Dikang Gu
        4. Screenshot 2015-12-16 16.11.36.png
          540 kB
          Dikang Gu

        Activity

          People

            dikanggu Dikang Gu
            rbranson Rick Branson
            Dikang Gu
            Branimir Lambov