Details
- Bug
- Status: Open
- Normal
- Resolution: Unresolved
- None
- Degradation - Performance Bug/Regression
- Normal
- Normal
- User Report
- Java11, Linux
- None
Description
We are migrating Cassandra from 3.11.x to 4.1.4 and upgrading the sstables from the `me-` to the `nb-` format, using sstableupgrade from Cassandra 4.1.4.
Unfortunately, the process is very slow (less than 0.5 MB/s).
Some observations:
- The process is slow only on (fast) SSDs, not on RAM disks.
- The sstables consist of many partitions (this may be unrelated).
- The upgrade is fast if we use `automatic_sstable_upgrade` instead of the sstableupgrade tool.
- The tool has enough heap (export MAX_HEAP_SIZE=8g).
On profiling, we found that sstableupgrade burns most of its CPU time in posix_fadvise (see flamegraph_sstableupgrade.png).
My naive interpretation of the whole maybeReopenEarly to posix_fadvise chain is that the process just tells the Linux kernel that the written data should not be cached. If we comment out the call to NativeLibrary.trySkipCache, the conversion runs at the expected 10 MB/s (see flamegraph_ok.png).
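To make the suspected pattern easier to discuss outside Cassandra, here is a minimal Python sketch of it. It is not Cassandra code. The helper names `write_chunks` and `compare` are hypothetical, and it assumes that the advice is issued after every chunk and covers the file from offset 0 up to the current position. I have not confirmed that this matches what maybeReopenEarly and NativeLibrary.trySkipCache actually do. It relies only on `os.posix_fadvise` with `POSIX_FADV_DONTNEED`, which Python exposes on Linux.

```python
import os
import tempfile
import time


def write_chunks(path, total_bytes, chunk_bytes, drop_cache):
    """Write total_bytes to path in chunks of chunk_bytes.

    If drop_cache is true, advise the kernel after every chunk that the
    data written so far does not need to stay in the page cache.
    Returns (elapsed_seconds, fadvise_calls).
    """
    buf = b"\0" * chunk_bytes
    calls = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < total_bytes:
            # os.write may write fewer bytes than requested, so track n.
            n = os.write(fd, buf[: total_bytes - written])
            written += n
            if drop_cache and hasattr(os, "posix_fadvise"):
                # Assumption (not verified against Cassandra): the advised
                # range always starts at offset 0 and grows with the file.
                os.posix_fadvise(fd, 0, written, os.POSIX_FADV_DONTNEED)
                calls += 1
    finally:
        os.close(fd)
    return time.perf_counter() - start, calls


def compare(total_bytes=8 * 1024 * 1024, chunk_bytes=64 * 1024):
    """Run the write once without and once with the per-chunk advice."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for label, drop in (("plain", False), ("fadvise", True)):
            path = os.path.join(tmp, label + ".bin")
            elapsed, calls = write_chunks(path, total_bytes, chunk_bytes, drop)
            results[label] = (elapsed, calls, os.path.getsize(path))
    return results


if __name__ == "__main__":
    for label, (elapsed, calls, size) in compare().items():
        print(f"{label}: {size} bytes, {calls} fadvise calls, {elapsed:.3f}s")
```

The timings will depend on the filesystem and the device, so this sketch is only useful for comparing the two variants on the same machine. Because the temporary directory may live on tmpfs, pointing the path at the SSD in question would be needed to mirror the SSD versus RAM disk observation above.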