Details
Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
Description
As mentioned by zznate at NGCC, the compaction task info at the end of the compaction log is pretty confusing.
Mainly, we only show the throughput of the sstable writer. If a lot of merging is being done, compaction can look really slow, because the output might be small even though the inputs were huge.
Also, bytes/sec isn't a great metric of work. We should really be reporting the CQL row throughput, since for the same bytes on disk we might be compacting 100k rows or 1 large one.
I've added a trivial patch that improves the logging info so that it now shows Read Throughput, Write Throughput, Rows/sec and total source partitions.
DEBUG [CompactionExecutor:1] 2016-06-23 12:22:06,114 CompactionTask.java:229 - Compacted (9edcfa50-395e-11e6-9944-3109153b1592) 2 sstables to [/home/jake/workspace/cassandra/data/data/stresscql/userpics-b9d2811038b711e69c04018b580faf7b/mb-11-big,] to level=0. 13.159MiB to 6.590MiB (~50% of original) in 2,474ms. Read Throughput = 5.317MiB/s, Write Throughput = 2.663MiB/s, Row Throughput = ~166,666/s. 500,000 total partitions merged to 250,000. Partition merge counts were {2:250000, }
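To make the relationship between these figures concrete, here is a minimal Java sketch that derives read throughput, write throughput and row throughput from the sizes and duration in the sample line above. It is not the code from the patch: the class `CompactionSummary` and its methods are hypothetical names, and the row count of 500,000 (one row per source partition) is an assumption, since the log line reports partitions rather than rows. Dividing by the exact 2,474 ms gives a row rate different from the logged ~166,666/s, so the patch presumably works from a coarser duration; that is a guess, not something the ticket states.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not Cassandra's CompactionTask code.
public final class CompactionSummary {
    private static final double MIB = 1024.0 * 1024.0;

    /** Throughput in MiB/s for a byte count processed over elapsedMillis. */
    public static double mibPerSecond(long bytes, long elapsedMillis) {
        return (bytes / MIB) / (elapsedMillis / 1000.0);
    }

    /** Rows per second for a row count processed over elapsedMillis. */
    public static double rowsPerSecond(long rows, long elapsedMillis) {
        return rows / (elapsedMillis / 1000.0);
    }

    /** Formats the three rates plus the partition counts in the style of the log line. */
    public static String summarize(long bytesRead, long bytesWritten, long rows,
                                   long partitionsIn, long partitionsOut,
                                   long elapsedMillis) {
        return String.format(Locale.US,
            "Read Throughput = %.3fMiB/s, Write Throughput = %.3fMiB/s, "
                + "Row Throughput = ~%,d/s. %,d total partitions merged to %,d.",
            mibPerSecond(bytesRead, elapsedMillis),      // based on input size
            mibPerSecond(bytesWritten, elapsedMillis),   // based on output size
            Math.round(rowsPerSecond(rows, elapsedMillis)),
            partitionsIn, partitionsOut);
    }

    public static void main(String[] args) {
        long bytesRead = (long) (13.159 * MIB);    // input size from the sample log line
        long bytesWritten = (long) (6.590 * MIB);  // output size from the sample log line
        long rows = 500_000;                       // ASSUMPTION: one row per source partition
        System.out.println(summarize(bytesRead, bytesWritten, rows, 500_000, 250_000, 2_474));
    }
}
```

Reporting the read rate next to the write rate is what keeps a merge-heavy compaction from looking slow: here the reader processes roughly twice as many bytes per second as the writer emits, because about half of the input is merged away.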