Description
On Cassandra 2.1.14, when a node falls far behind and has tens of thousands of sstables, it appears that a lot of the CPU time is spent doing checks like this during a call to getMaxPurgeableTimestamp:
org.apache.cassandra.utils.Murmur3BloomFilter.hash(java.nio.ByteBuffer, int, int, long, long[]) @bci=13, line=57 (Compiled frame; information may be imprecise)
- org.apache.cassandra.utils.BloomFilter.indexes(java.nio.ByteBuffer) @bci=22, line=82 (Compiled frame)
- org.apache.cassandra.utils.BloomFilter.isPresent(java.nio.ByteBuffer) @bci=2, line=107 (Compiled frame)
- org.apache.cassandra.db.compaction.CompactionController.maxPurgeableTimestamp(org.apache.cassandra.db.DecoratedKey) @bci=89, line=186 (Compiled frame)
- org.apache.cassandra.db.compaction.LazilyCompactedRow.getMaxPurgeableTimestamp() @bci=21, line=99 (Compiled frame)
- org.apache.cassandra.db.compaction.LazilyCompactedRow.access$300(org.apache.cassandra.db.compaction.LazilyCompactedRow) @bci=1, line=49 (Compiled frame)
- org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=241, line=296 (Compiled frame)
- org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=1, line=206 (Compiled frame)
- org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext() @bci=44, line=206 (Compiled frame)
- com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame)
- com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame)
- com.google.common.collect.Iterators$7.computeNext() @bci=4, line=645 (Compiled frame)
- com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame)
- com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame)
- org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(java.util.Iterator) @bci=1, line=166 (Compiled frame)
- org.apache.cassandra.db.compaction.LazilyCompactedRow.write(long, org.apache.cassandra.io.util.DataOutputPlus) @bci=52, line=121 (Compiled frame)
- org.apache.cassandra.io.sstable.SSTableWriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=18, line=193 (Compiled frame)
- org.apache.cassandra.io.sstable.SSTableRewriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=13, line=127 (Compiled frame)
- org.apache.cassandra.db.compaction.CompactionTask.runMayThrow() @bci=666, line=197 (Compiled frame)
- org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Compiled frame)
- org.apache.cassandra.db.compaction.CompactionTask.executeInternal(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=6, line=73 (Compiled frame)
- org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=2, line=59 (Compiled frame)
- org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run() @bci=125, line=264 (Compiled frame)
- java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
- java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Compiled frame)
- java.lang.Thread.run() @bci=11, line=745 (Compiled frame)
It would help if we could, at least on startup, pass a flag like -DskipTombstonePurgeCheck. In these particularly bad cases we could then skip the calculation and just merge sstables until there are fewer to worry about, and restart the node without the flag once we're down to a more manageable number of sstables.
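To make the request concrete, here is a minimal, self-contained sketch of how such a flag might short-circuit the per-key check. It is not the actual CompactionController code: the class PurgeCheckSketch, the type OverlappingSSTable and the field SKIP_PURGE_CHECK are hypothetical stand-ins, and the property name is simply the one proposed above. It also assumes that "skipping" means treating nothing as purgeable, so tombstones are kept rather than dropped without the overlap check.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch only; names here do not come from the Cassandra code base.
class PurgeCheckSketch {

    // Read once at startup. Boolean.getBoolean is true only when the property
    // is set to "true", so the JVM would need -DskipTombstonePurgeCheck=true.
    static final boolean SKIP_PURGE_CHECK = Boolean.getBoolean("skipTombstonePurgeCheck");

    // Stand-in for an overlapping sstable: a bloom-filter-like membership test
    // plus the minimum timestamp of the data it holds.
    static final class OverlappingSSTable {
        final Predicate<String> mayContain;
        final long minTimestamp;

        OverlappingSSTable(Predicate<String> mayContain, long minTimestamp) {
            this.mayContain = mayContain;
            this.minTimestamp = minTimestamp;
        }
    }

    // Returns the timestamp below which tombstones for this key may be purged.
    static long maxPurgeableTimestamp(String key, List<OverlappingSSTable> overlapping, boolean skipCheck) {
        if (skipCheck) {
            // Assumption: keep every tombstone. Nothing is older than MIN_VALUE,
            // so nothing is purgeable and no filter is consulted.
            return Long.MIN_VALUE;
        }
        long min = Long.MAX_VALUE;
        for (OverlappingSSTable sstable : overlapping) {
            // This membership test per overlapping sstable is the hot spot in the trace above.
            if (sstable.mayContain.test(key)) {
                min = Math.min(min, sstable.minTimestamp);
            }
        }
        return min;
    }
}
```

A caller would pass SKIP_PURGE_CHECK as the last argument. With the flag on, the cost per key no longer depends on the number of overlapping sstables, at the price of carrying all tombstones through the compaction until the node is restarted without it.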