I like the concept, but does this go far enough? If Values aren't special, then are Keys special, and if so, why? Should we make SortedKeyValueIterator implement Iterable<? extends Object>? Then the bottom-level iterator (the RFile reader) would yield KeyValue or Entry<Key,Value> objects, the top-level iterator for scans would have to yield objects that are serializable, and the top-level iterator for compactions would have to implement Iterable<Entry<Key,Value>>.
One of the problems we have with iterators now is that the Key and Value are accessed with separate methods, even though they're always read from disk together. Splitting up the Key and Value on the server side is somewhat arbitrary and could reduce our ability to parallelize iterators (if we ever decide that's something we want to do).
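To make the separate-methods point concrete, here is a rough sketch of how a key/value pair could be handed out as a single entry instead. TopIterator is a hypothetical, cut-down stand-in that only mirrors the hasTop/getTopKey/getTopValue/next shape of the current interface, and EntryIterator and ListTopIterator are names made up for illustration; none of this is existing Accumulo code.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the current style: key and value come from separate calls.
interface TopIterator<K, V> {
    boolean hasTop();
    K getTopKey();
    V getTopValue();
    void next();
}

// Adapter that returns the key and value together as one immutable entry.
final class EntryIterator<K, V> implements Iterator<Map.Entry<K, V>> {
    private final TopIterator<K, V> source;

    EntryIterator(TopIterator<K, V> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasTop();
    }

    @Override
    public Map.Entry<K, V> next() {
        if (!source.hasTop()) {
            throw new NoSuchElementException();
        }
        // Read both halves before advancing, so the pair stays consistent.
        Map.Entry<K, V> entry =
            new SimpleImmutableEntry<>(source.getTopKey(), source.getTopValue());
        source.next();
        return entry;
    }
}

// In-memory TopIterator over a list, only for demonstration.
final class ListTopIterator<K, V> implements TopIterator<K, V> {
    private final List<Map.Entry<K, V>> entries;
    private int pos = 0;

    ListTopIterator(List<Map.Entry<K, V>> entries) {
        this.entries = entries;
    }

    @Override public boolean hasTop() { return pos < entries.size(); }
    @Override public K getTopKey() { return entries.get(pos).getKey(); }
    @Override public V getTopValue() { return entries.get(pos).getValue(); }
    @Override public void next() { pos++; }
}
```

With this shape a consumer never sees a key without its value, which is the property that would matter if entries were ever handed to separate workers.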
Another problem is that SortedKeyValueIterator falls somewhere between Java's Iterator and Iterable interfaces. It holds onto filters, aggregation parameters, etc., which makes it act like a collection, and it also keeps a pointer to a position in that collection, like an Iterator. I think we should change SortedKeyValueIterator into something more like an immutable collection, or a consistent, isolated, unchanging view of the data, and have it implement Iterable. That might open up opportunities for automatically optimizing queries on the server side, or for better support of built-in iterator tree definition languages.
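As a rough illustration of the "immutable view that implements Iterable" idea, the sketch below keeps the configuration (here just a filter) in the view and leaves the cursor to the Iterator returned by each iterator() call. EntryView and its methods are hypothetical names, the backing List stands in for whatever the real data source would be, and the sketch assumes the entries themselves are immutable.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical immutable view: holds configuration (a filter) but no position.
final class EntryView<K, V> implements Iterable<Map.Entry<K, V>> {
    private final List<Map.Entry<K, V>> source;
    private final Predicate<Map.Entry<K, V>> filter;

    private EntryView(List<Map.Entry<K, V>> source, Predicate<Map.Entry<K, V>> filter) {
        this.source = source;
        this.filter = filter;
    }

    // Takes an unmodifiable snapshot so later changes to the caller's list are not seen.
    static <K, V> EntryView<K, V> of(List<Map.Entry<K, V>> entries) {
        return new EntryView<>(List.copyOf(entries), e -> true);
    }

    // Returns a new, narrower view; this view is left unchanged.
    EntryView<K, V> filter(Predicate<Map.Entry<K, V>> extra) {
        return new EntryView<>(source, filter.and(extra));
    }

    // Each call starts a fresh, independent pass over the same unchanging data.
    @Override
    public Iterator<Map.Entry<K, V>> iterator() {
        return source.stream().filter(filter).iterator();
    }
}
```

Because a view like this is only a description of what to read, a server could in principle inspect or rearrange a chain of such views before any data is touched, which is where the optimization opportunity would come from.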