SSTableIndex instances maintain a Ref to the underlying SSTableReader instance. To decide whether to delete the index file after the last SSTableIndex reference is released, the implementation checks sstableRef.globalCount(): https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/SSTableIndex.java#L135. This is incorrect because sstableRef.globalCount() returns the number of references to that specific instance of SSTableReader, and in cases like index summary redistribution there can be more than one SSTableReader instance for the same SSTable. Further, since the reader is shared across multiple indexes, not every index sees the count reach 0 when it releases its reference. As a result, the SSTableIndex file can be deleted while it is still needed, or left on disk when it should have been deleted.
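To make the failure mode concrete, here is a minimal toy model in plain Java. It does not use any Cassandra classes: ReaderInstance and wouldDeleteIndexFile are hypothetical names, and the model simply assumes, as described above, that each reader instance carries its own count and that deletion is triggered when that per-instance count reaches zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class PerInstanceCountDemo {
    /** Hypothetical stand-in for one reader instance with its own reference count. */
    static final class ReaderInstance {
        final String sstable;                       // name of the on-disk SSTable
        final AtomicInteger refs = new AtomicInteger();

        ReaderInstance(String sstable) { this.sstable = sstable; }

        void acquire() { refs.incrementAndGet(); }

        /** Returns the count remaining for this instance only. */
        int release() { return refs.decrementAndGet(); }
    }

    /** Models the flawed check: delete once this instance's count hits zero. */
    static boolean wouldDeleteIndexFile(ReaderInstance reader) {
        return reader.release() == 0;
    }

    public static void main(String[] args) {
        // Two reader instances over the same files, e.g. after a summary resample.
        ReaderInstance original = new ReaderInstance("sstable-1");
        ReaderInstance resampled = new ReaderInstance("sstable-1");
        original.acquire();
        resampled.acquire();

        boolean deleted = wouldDeleteIndexFile(original);
        boolean stillInUse = resampled.refs.get() > 0;
        System.out.println("deleted=" + deleted + ", stillInUse=" + stillInUse);
    }
}
```

In this model the check on the original instance reports that the file can be deleted even though the second instance still holds a reference, which is the premature-deletion case.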
A more correct implementation would be to either:
- Tie into the existing SSTableTidier. SASI indexes are already SSTable components, but the SSTableTidier does not clean them up because the current cleanup implementation does not find them.
- Revamp SSTableIndex reference counting to use Ref and implement a new tidier.
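As a rough illustration of the second option, the sketch below shows a single count shared by every holder of the same index file, with a tidier that runs exactly once when the last reference is released. It is plain Java and does not use Cassandra's Ref or tidier classes. SharedCount, Tidier, tryRef and release are hypothetical names chosen for this example, and a real implementation would presumably build on the existing Ref machinery instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedIndexRefDemo {
    /** Hypothetical cleanup hook, invoked once when the last reference is gone. */
    interface Tidier { void tidy(); }

    /** One count shared by all holders of the same on-disk index file. */
    static final class SharedCount {
        private final AtomicInteger refs = new AtomicInteger(1); // creator holds one ref
        private final Tidier tidier;

        SharedCount(Tidier tidier) { this.tidier = tidier; }

        /** Takes a new reference; returns false if the resource was already tidied. */
        boolean tryRef() {
            while (true) {
                int cur = refs.get();
                if (cur <= 0) return false;
                if (refs.compareAndSet(cur, cur + 1)) return true;
            }
        }

        /** Drops one reference and runs the tidier when the count reaches zero. */
        void release() {
            int n = refs.decrementAndGet();
            if (n == 0) tidier.tidy();
            else if (n < 0) throw new IllegalStateException("released more often than acquired");
        }
    }

    public static void main(String[] args) {
        AtomicInteger deletions = new AtomicInteger();
        SharedCount count = new SharedCount(() -> deletions.incrementAndGet());

        count.tryRef();   // a second index view over the same file
        count.release();  // first holder done: file must survive
        System.out.println("deletions after first release: " + deletions.get());
        count.release();  // last holder done: tidier runs
        System.out.println("deletions after last release: " + deletions.get());
    }
}
```

Because the count lives with the file and not with any one reader instance, the deletion decision no longer depends on which instance happens to release last.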