All truncation do is `cfs.markCompacted(truncatedSSTables)` without holding any lock or anything. Which have the effect of actually deleting sstables that may be compacting. More precisely there is three problems:
- It removes those compacting sstables from the current set of active sstables for the cfs. But when they are done compacting, DataTracker.replaceCompactedSSTables() will be called and it assumes that the compacted sstable are parts of the current set of active sstables. In other words, we'll get an exception looking like the one of
- The result of the compaction will be added as a new active sstable (actually no, because the code will throw an exception before because of the preceding point, but that's something we should probably deal with).
- Currently, compaction don't 'acquire references' on SSTR. That's because the code assumes we won't compact twice the same sstable and that compaction is the only mean to delete an sstable. With these two assumption, acquiring references is not necessary, but truncate break that first assumption.
As for solution, I see two possibilities:
- make the compaction lock be per-cf instead of global (which I think is easy and a good idea anyway) and grab the write lock to do the markCompacted call. The big downside is that truncation will potentially take much longer.
- had two phases: mark the sstable that are not compacting as compacted and set the dataTracker as 'truncated at', and let it deal with the other sstable when their compaction is done. A bit like what is proposed for