Description
While running some tests with incremental repair, we ran into an issue where some data was repaired over and over again. The repairs were scheduled to run every two hours, each time on a different node. For example:
node1 would repair on hours 0, 8, 16
node2 would repair on hours 2, 10, 18
node3 would repair on hours 4, 12, 20
node4 would repair on hours 6, 14, 22
The data being repaired over and over was in a table with static data, so repair should only have needed to run once for that table. This table generated ~700 small sstables per repair, and when I checked, one node had several thousand sstables in that table alone.
The repair command used on each node was:
repair -inc -par
So after stopping all clients and waiting for compactions to finish, I ran sstablemetadata on the tables and saw that one table wasn't repaired. After checking the logs I saw something like this:
SSTable ..-ka-X-Data.db (..) will be anticompacted on range (..)
...
SSTable ..-ka-X-Data.db (..) does not intersect repaired range (..), not touching repairedAt.
So I checked the code, and there seems to be an issue when one of the repaired ranges does not intersect the sstable's range. In that case the sstable is removed from the anticompaction, regardless of whether any other repaired range intersects it.
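To make the suspected logic error concrete, here is a minimal, self-contained sketch. It is not the actual Cassandra code: `TokenRange`, `SSTable`, `selectBuggy` and `selectFixed` are hypothetical names, and token ranges are simplified to closed intervals of longs. It only illustrates the difference between dropping an sstable as soon as one repaired range misses it and dropping it only when no repaired range intersects it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class AnticompactionSketch {
    // Hypothetical closed interval [left, right]; a simplification, not Cassandra's Range<Token>.
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean intersects(TokenRange other) {
            return left <= other.right && other.left <= right;
        }
    }

    // Hypothetical stand-in for an sstable: a name plus the token span it covers.
    static final class SSTable {
        final String name;
        final TokenRange span;

        SSTable(String name, TokenRange span) {
            this.name = name;
            this.span = span;
        }
    }

    // Behaviour as described in the report: the sstable is dropped as soon as
    // ONE repaired range does not intersect it, even if another range does.
    static List<SSTable> selectBuggy(List<SSTable> sstables, List<TokenRange> repaired) {
        List<SSTable> candidates = new ArrayList<>(sstables);
        Iterator<SSTable> it = candidates.iterator();
        while (it.hasNext()) {
            SSTable sstable = it.next();
            for (TokenRange range : repaired) {
                if (!sstable.span.intersects(range)) {
                    it.remove();
                    break;
                }
            }
        }
        return candidates;
    }

    // Assumed intent: drop the sstable only if NO repaired range intersects it.
    static List<SSTable> selectFixed(List<SSTable> sstables, List<TokenRange> repaired) {
        List<SSTable> candidates = new ArrayList<>();
        for (SSTable sstable : sstables) {
            boolean intersectsAny = false;
            for (TokenRange range : repaired) {
                if (sstable.span.intersects(range)) {
                    intersectsAny = true;
                    break;
                }
            }
            if (intersectsAny) {
                candidates.add(sstable);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<TokenRange> repaired = Arrays.asList(new TokenRange(0, 10), new TokenRange(50, 60));
        List<SSTable> sstables = Arrays.asList(
                new SSTable("a", new TokenRange(5, 8)),    // inside the first repaired range only
                new SSTable("b", new TokenRange(20, 30))); // outside both repaired ranges
        System.out.println("buggy keeps " + selectBuggy(sstables, repaired).size() + " sstable(s)");
        System.out.println("fixed keeps " + selectFixed(sstables, repaired).size() + " sstable(s)");
    }
}
```

In this toy model, sstable "a" lies inside one repaired range but not the other, so the first variant drops it from anticompaction and its repairedAt would never be updated. That would match the symptom of the same data being repaired again on every run.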
Attaching a patch for 2.1 that solves this; I am working on a dtest for it. I will create patches for 2.2 and 3.0 as well.