This is indeed compaction related (but not related to secondary indexing at
all). The problem is that compaction may lose some rows.
Because of the way the ReducingIterator works, when we create a new
CompactedRow, we have already decoded the next row key and
the file pointer is after that next row key. Both PreCompactedRow and
LazilyCompactedRow handle this correctly by "resetting" their
SSTableIdentityIterator before reading (SSTII.getColumnFamilyWithColumns()
does it for PreCompactedRow and LazilyCompactedRow calls SSTII.reset()
directly). But EchoedRow doesn't handle this correctly. Hence, when
EchoedRow.isEmpty() is called, it calls SSTII.hasNext(), which compares
the current file pointer to the finishedAt value of the iterator. Because the
pointer is already on the next row, this test always fails: hasNext() returns
false, the row is treated as empty, and it is skipped.
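
To make the failure mode described above concrete, here is a minimal, self-contained sketch. It is not the Cassandra code: FakeFile, RowIterator and the two isEmpty variants are hypothetical stand-ins, and the offsets are made up. It only assumes what is stated above, namely that hasNext() compares a shared file pointer to finishedAt, and that resetting the iterator moves the pointer back to the start of the row. The reset-based variant mirrors what the other two row types do and is not necessarily what the attached patch does.

```java
public class EchoedRowSketch {
    /** Hypothetical stand-in for the shared data file: it only tracks a read position. */
    static class FakeFile {
        long pointer = 0;
    }

    /** Simplified stand-in for an SSTII over one row stored in [dataStart, finishedAt). */
    static class RowIterator {
        final FakeFile file;
        final long dataStart;
        final long finishedAt;

        RowIterator(FakeFile file, long dataStart, long finishedAt) {
            this.file = file;
            this.dataStart = dataStart;
            this.finishedAt = finishedAt;
        }

        /** True only while the shared pointer is still inside this row. */
        boolean hasNext() {
            return file.pointer < finishedAt;
        }

        /** Moves the shared pointer back to the start of this row's data. */
        void reset() {
            file.pointer = dataStart;
        }
    }

    /** Mimics the problematic check: trusts wherever the pointer happens to be. */
    static boolean isEmptyWithoutReset(RowIterator row) {
        return !row.hasNext();
    }

    /** One possible remedy (an assumption): reset before checking. */
    static boolean isEmptyWithReset(RowIterator row) {
        row.reset();
        return !row.hasNext();
    }

    public static void main(String[] args) {
        FakeFile file = new FakeFile();
        RowIterator first = new RowIterator(file, 10, 50);

        // The merging step has already decoded the next row's key,
        // so the shared pointer now sits past the end of the first row.
        file.pointer = 60;

        System.out.println(isEmptyWithoutReset(first)); // prints "true": row wrongly looks empty
        System.out.println(isEmptyWithReset(first));    // prints "false": row data is seen again
    }
}
```

Without the reset, a non-empty row is reported as empty purely because of where the shared pointer was left, which is the skipped-row symptom.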
Attaching a patch against 0.8 with a (smaller) unit test.
Note that luckily this doesn't affect 0.7, because it only uses EchoedRow for
cleanup compactions, and cleanup compactions do not use ReducingIterator (and
thus, the underlying SSTII won't have changed when the EchoedRow is built).
I would still be in favor of committing the patch there too, just to make sure
we don't hit this later.