The format in which result sets are serialized in the native protocol has the advantage of being very simple (which was, initially, a feature), but it is far from optimal. It's probably time to think about optimizing it.
At the very least, there are two simple optimizations we can do:
- we can avoid repeating the partition key columns. The same goes for duplicate clustering column values when we have more than one clustering column, though we'd have to recompute this at the CQL level since, unlike the partition key case, we don't have this "optimization" internally (yet, at least).
- we can optimize the serialization of values of fixed-width types (like we now do internally) by skipping the current 4-byte length. We could also maybe use vints for the remaining cases where there is a length.
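
To make the two ideas above a bit more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not a proposed wire format: the FIXED_WIDTH table, the function names and the row shape are all made up for the example, the varint is a generic LEB128-style encoding rather than the vint encoding used internally, and null values (currently signalled by a negative length) are ignored entirely.

```python
import struct

# Hypothetical table of fixed-width type sizes in bytes (illustrative only).
FIXED_WIDTH = {"int": 4, "bigint": 8, "double": 8, "uuid": 16}


def encode_unsigned_varint(n: int) -> bytes:
    """Generic LEB128-style unsigned varint (an assumption, not the internal vint)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_value_current(value: bytes) -> bytes:
    """Current layout: 4-byte big-endian length followed by the value bytes."""
    return struct.pack(">i", len(value)) + value


def encode_value_compact(type_name: str, value: bytes) -> bytes:
    """Sketch of the compact layout: no length for fixed-width types,
    a varint length otherwise. Nulls are not handled here."""
    width = FIXED_WIDTH.get(type_name)
    if width is not None:
        if len(value) != width:
            raise ValueError(f"{type_name} expects {width} bytes")
        return value
    return encode_unsigned_varint(len(value)) + value


def group_by_partition(rows):
    """Collapse consecutive (partition_key, rest) rows that share a partition key
    so the key only needs to be written once per group."""
    groups = []
    for partition_key, rest in rows:
        if groups and groups[-1][0] == partition_key:
            groups[-1][1].append(rest)
        else:
            groups.append((partition_key, [rest]))
    return groups
```

With this sketch, a 4-byte int goes from 8 bytes on the wire to 4, and a short text value pays 1 byte of length instead of 4. How nulls would be represented once the length is dropped for fixed-width types is left open here.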
But of course, it's worth considering other potential optimizations while we're at it.