On some deployments I have seen tar files with a quite high generation suffix (e.g. 'v'). From the log files I could deduce that this particular tar file was rewritten multiple times without any segment actually being removed.
I assume this is caused by the 25% gain threshold not taking the sizes contributed by the index and the graph entries into account.
We should try to come up with a test case validating the above hypothesis. A fix should then be relatively straightforward: either include the sizes of these two entries in the calculation or skip further cleanup cycles once the file size drops below a certain threshold.
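To make the hypothesis concrete, here is a rough sketch of how the threshold check could misfire. This is not the actual cleanup code: the class name, the method names, the minimum-size cut-off, and all byte counts are made up for illustration, and it assumes the check compares the size of the retained segments against 3/4 of the current file size.

```java
// Hypothetical sketch, not the real cleanup implementation.
// All names and numbers below are assumptions chosen for illustration.
final class GainThresholdSketch {

    // Assumed current behaviour: only the retained segment bytes are
    // compared against 75% of the file size, so the fixed overhead of
    // the index and graph entries counts as reclaimable space.
    public static boolean rewriteIgnoringOverhead(long fileSize, long retainedSegmentBytes) {
        return retainedSegmentBytes < fileSize * 3 / 4;
    }

    // First proposed fix: the index and graph entries are written again
    // after a rewrite, so they are added to the expected new size.
    public static boolean rewriteWithOverhead(long fileSize, long retainedSegmentBytes,
                                              long indexBytes, long graphBytes) {
        return retainedSegmentBytes + indexBytes + graphBytes < fileSize * 3 / 4;
    }

    // Second proposed fix: never rewrite files below a minimum size.
    public static boolean rewriteWithMinimumSize(long fileSize, long retainedSegmentBytes,
                                                 long minimumFileSize) {
        return fileSize >= minimumFileSize
                && rewriteIgnoringOverhead(fileSize, retainedSegmentBytes);
    }

    public static void main(String[] args) {
        // A small tar file in which every segment is still referenced:
        // 6000 bytes of segments plus 2000 bytes each for index and graph.
        long fileSize = 10_000;
        long retained = 6_000;
        long index = 2_000;
        long graph = 2_000;

        // Rewritten although nothing can be removed (6000 < 7500).
        System.out.println(rewriteIgnoringOverhead(fileSize, retained));
        // Not rewritten once the overhead is counted (10000 >= 7500).
        System.out.println(rewriteWithOverhead(fileSize, retained, index, graph));
        // Not rewritten because the file is below the assumed cut-off.
        System.out.println(rewriteWithMinimumSize(fileSize, retained, 64 * 1024));
    }
}
```

If the real check behaves like the first method, a small file whose index and graph entries make up more than 25% of its size would be rewritten on every cleanup cycle, bumping the generation each time. A test case could build such a file and assert that cleanup leaves it untouched.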