If there isn't enough disk space available to compact all existing sstables, Cassandra will attempt a partial compaction by removing sstables from the set of candidate sstables, starting with the largest one. The removed sstable may contain data for which there are tombstones in another (more recent) sstable. Because the overlaps between sstables are computed when the CompactionController is created, and the CompactionController is created before any sstables are removed from the candidate set, the computed overlaps are outdated (they do not account for the removed sstable) when checking which sstables are covered by certain tombstones. This leads to the faulty conclusion that the tombstones can be pruned during the compaction, causing the data to be resurrected.
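The ordering problem can be illustrated with a small toy model. This is only a sketch of the mechanism described above, not Cassandra code: the class `StaleOverlapSketch`, the nested `SSTable` type, and the methods `overlaps` and `purgeable` are hypothetical names invented for illustration, and the purge check is simplified to "no overlapping sstable holds the key" (timestamp comparisons are ignored).

```java
import java.util.*;

// Toy model only: all names here are hypothetical, not Cassandra's API.
class StaleOverlapSketch {
    // A simplified sstable: a name, an on-disk size, and the partition keys it holds.
    static final class SSTable {
        final String name; final long size; final Set<String> keys;
        SSTable(String name, long size, Set<String> keys) {
            this.name = name; this.size = size; this.keys = keys;
        }
    }

    // Overlaps: live sstables outside the compacting set that share a key with it.
    static Set<SSTable> overlaps(Collection<SSTable> live, Collection<SSTable> compacting) {
        Set<String> compactingKeys = new HashSet<>();
        for (SSTable s : compacting) compactingKeys.addAll(s.keys);
        Set<SSTable> result = new HashSet<>();
        for (SSTable s : live) {
            if (!compacting.contains(s) && !Collections.disjoint(s.keys, compactingKeys)) {
                result.add(s);
            }
        }
        return result;
    }

    // Simplified purge check: a tombstone for `key` may be dropped only if
    // no overlapping sstable still holds that key.
    static boolean purgeable(String key, Set<SSTable> overlaps) {
        for (SSTable s : overlaps) {
            if (s.keys.contains(key)) return false;
        }
        return true;
    }

    // Returns {decision with overlaps computed BEFORE the reduction,
    //          decision with overlaps computed AFTER the reduction}.
    static boolean[] demo() {
        SSTable big = new SSTable("big", 100, Set.of("k"));    // holds the old data for k
        SSTable small = new SSTable("small", 1, Set.of("k"));  // holds the tombstone for k
        List<SSTable> live = List.of(big, small);
        List<SSTable> candidates = new ArrayList<>(live);

        // Overlaps computed up front: "big" is still a candidate, so it is not an overlap.
        Set<SSTable> stale = overlaps(live, candidates);

        // Not enough disk space: drop the largest candidate (partial compaction).
        candidates.remove(Collections.max(candidates, Comparator.comparingLong((SSTable s) -> s.size)));

        // Overlaps recomputed after the reduction now include "big".
        Set<SSTable> fresh = overlaps(live, candidates);

        return new boolean[] { purgeable("k", stale), purgeable("k", fresh) };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("stale overlaps: purgeable=" + r[0]);
        System.out.println("fresh overlaps: purgeable=" + r[1]);
    }
}
```

In this model the stale overlap set is empty, so the tombstone for `k` is judged safe to drop even though the excluded sstable still holds the data it shadows. Recomputing the overlaps after the candidate set has been reduced gives the opposite answer.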
The issue is present in Cassandra 4.0 and 4.1. Cassandra 3.11 creates the CompactionController after the set of sstables to compact has been reduced, and is thus not affected. The trunk branch does not appear to support partial compactions at all; instead, it refuses to compact when the disk is full.
This regression appears to have been introduced by