v2 removes the commit log header completely in favor of sstable metadata about where to replay from (patch against 0.8).
This differs from v1 in that, instead of keeping every (segment, replay_position) pair, we keep for a given sstable only the position in the most recent segment (that is, we leverage the fact that commit log segments are named with increasing timestamps).
The reason for this is twofold:
- this is more compact (and simpler)
- if we remove the commit log header, we need to be able to say whether a given segment is dirty or not for a given column family. That is, we don't want to know whether some replay position ever existed on this segment, but whether a relevant one still exists. So for a given column family we really only care about the newest (segment, replay_position) pair.
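
To make the "newest pair only" idea concrete, here is a minimal sketch in Python (not the patch's actual code). The names `ReplayPosition`, `newest_flushed` and `should_replay` are hypothetical, and the boundary convention (a mutation is identified by the offset at which it ends, and is replayed only if that is strictly after the flushed position) is an assumption made for illustration.

```python
from typing import Iterable, NamedTuple


class ReplayPosition(NamedTuple):
    """Hypothetical (segment, replay_position) pair stored in sstable metadata."""
    segment: int   # commit log segment id, assumed to be an increasing timestamp
    position: int  # byte offset within that segment


# Sentinel for a column family with no flushed sstable: everything is replayed.
NONE = ReplayPosition(-1, 0)


def newest_flushed(sstable_positions: Iterable[ReplayPosition]) -> ReplayPosition:
    """Newest pair across the sstables of one column family.

    Tuples compare lexicographically, so a later segment always wins and the
    offset only breaks ties within the same segment.
    """
    return max(sstable_positions, default=NONE)


def should_replay(mutation_at: ReplayPosition, flushed: ReplayPosition) -> bool:
    """Assumed rule: replay only mutations located strictly after the flush point."""
    return mutation_at > flushed


if __name__ == "__main__":
    flushed = newest_flushed([ReplayPosition(100, 40), ReplayPosition(200, 10)])
    for m in [ReplayPosition(100, 999), ReplayPosition(200, 11), ReplayPosition(300, 0)]:
        print(m, "replay" if should_replay(m, flushed) else "skip")
```

Because segment ids only increase, anything in an older segment is covered by the newest pair, which is why a single pair per sstable is enough under these assumptions.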
Now there is the question of the upgrade path. With this patch, the (existing) commit log headers will be ignored. This means that, ideally, people would use drain before upgrading to a version that has this patch. If they do not, the commit logs will be fully replayed. Pre-0.8, that's not a big deal. With counters, it could mean over-counts (that's exactly what this ticket is about). So I would be in favor of putting this in 0.8.0, since it is a bug fix and it will avoid the problem of upgrading from a version that already has counters. But I admit this is not a trivial patch, so ...