Sorry, I got caught up in this over the weekend, since the more I dug into CASSANDRA-5344, the more it looked like I was actually solving this ticket first. I've pushed my first draft to http://github.com/jbellis/cassandra/tree/4180.
I see DefsTest fail occasionally, but it is not 100% reproducible, so I'm not sure whether my changes caused it. I also see CFSTest log errors sometimes, but that one is definitely not related.
(Testing was a bitch at first since sstable errors would just cause the schema loader to break violently; hence the option I added to just inject the schema directly, without going through the migration path. That allowed enough tests to run to track down the problems.)
Everything was fairly straightforward except SSTableScanner. (Sounds like the same thing Jason ran into.) I simplified things by noting that seekTo was only used to initialize the scanner to a certain starting point, so I pulled that into the constructor to make seeking mid-iteration a non-concern. (This also allowed removing SSTableBoundedScanner.) I also merged KeyScanningIterator and FilteringKSI; FKSI already had most of the code needed to compute data size from the index entries, which compaction needed to decide whether to use an eager or lazy approach.
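To illustrate the two ideas above (fixing the start point in the constructor, and deriving row data size from adjacent index entries), here is a minimal sketch. It is not the code on the branch: IndexEntry, SketchScanner, and nextDataSize are hypothetical names, and keys are plain strings rather than Cassandra's real key types.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an index entry: a key plus the offset of its row in the data file.
final class IndexEntry {
    final String key;
    final long position;

    IndexEntry(String key, long position) {
        this.key = key;
        this.position = position;
    }
}

// Hypothetical scanner: the start key is fixed at construction, so there is
// no seekTo() and no way to reposition once iteration has begun.
final class SketchScanner implements Iterator<String> {
    private final List<IndexEntry> index; // assumed sorted by key
    private final long dataLength;        // total length of the data file
    private int next;

    SketchScanner(List<IndexEntry> index, long dataLength, String startKey) {
        this.index = index;
        this.dataLength = dataLength;
        // Position on the first entry whose key is >= startKey.
        int i = 0;
        while (i < index.size() && index.get(i).key.compareTo(startKey) < 0) {
            i++;
        }
        this.next = i;
    }

    @Override
    public boolean hasNext() {
        return next < index.size();
    }

    // Size of the row that next() would return, computed from the index alone:
    // the following entry's position (or the file length for the last row)
    // minus this entry's position.
    long nextDataSize() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        long end = next + 1 < index.size() ? index.get(next + 1).position : dataLength;
        return end - index.get(next).position;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return index.get(next++).key;
    }
}
```

A caller such as compaction could compare nextDataSize() against a size threshold before deciding whether to read the row eagerly or lazily; that comparison is left out here.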
There were a lot of places that just did one-off sstable reading, and they were easy to miss. This smells fishy to me, but it wasn't obvious how to reorganize things to make it unnecessary, so I haven't tried to solve that here.
I also haven't tried to update scrub for the new format.