It is a good question, and I suppose it depends on what the motivation for row groups was in the first place (after all, we've always kind of been able to nest arbitrarily; we just have a (slightly) simpler way now).
For instance, if the goal is to make sure rows are co-located, having to do it with composites may not be very convenient, in particular if you want to co-locate rows across multiple CFs. Of course, it is always possible to redesign the model so that you use the same row key and composites, but that could be really weird. To "solve" that last part, we could provide the row group API but encode it server side with composites.
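To make the "row group API encoded with composites" idea concrete, here is a rough sketch in Python rather than actual Cassandra code. Everything in it is an assumption made for illustration: the `physical_rows` store, the `put`/`get_row` names, and the choice of `(cf, row_key, column)` as the composite column name under a single physical row keyed by the group.

```python
from bisect import bisect_left

# Hypothetical in-memory stand-in: one physical row per row group.
# Columns stay sorted by composite name, as a composite comparator would keep them.
physical_rows = {}  # group key -> sorted list of ((cf, row_key, column), value)


def put(group, cf, row_key, column, value):
    """Write one logical column; it lands in the group's single physical row."""
    cols = physical_rows.setdefault(group, [])
    name = (cf, row_key, column)
    i = bisect_left(cols, (name,))  # first entry whose composite name is >= name
    if i < len(cols) and cols[i][0] == name:
        cols[i] = (name, value)     # overwrite the existing column
    else:
        cols.insert(i, (name, value))


def get_row(group, cf, row_key):
    """Read one logical row as a slice over the (cf, row_key) prefix."""
    cols = physical_rows.get(group, [])
    prefix = (cf, row_key)
    i = bisect_left(cols, (prefix,))  # a prefix sorts before all of its extensions
    row = {}
    while i < len(cols) and cols[i][0][:2] == prefix:
        row[cols[i][0][2]] = cols[i][1]
        i += 1
    return row


put("user42", "profile", "alice", "email", "a@example.com")
put("user42", "posts", "alice", "p1", "hello")
print(get_row("user42", "profile", "alice"))
```

The client only ever sees logical rows, while every row of the group (here even rows from two logical CFs) ends up in the same physical row, which is exactly where the limitations discussed next come from.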
However, I think we should be aware that pushing such an encoding has limitations today:
- it has the same problem as encoding super columns with composites, i.e. we'd need range tombstones.
- rows have a number of subtle limitations that are fine, but may be a bit less fine if you start to push for co-locating lots and lots of data under one row:
  - there is the 2B columns limit
  - if a row is > 2GB, it won't be mmapped
  - compaction is slower on big rows
  - performance can globally be worse on huge rows
  - leveled compaction has at least one row per sstable, which goes a bit against fixed-size sstables.
Don't get me wrong, for most cases this is probably fine, and we likely want to improve on all of this, but those are still obstacles to co-locating large amounts of data under the same row.
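To illustrate the range tombstone point from the list above: once a logical row is just a prefix of composite column names, deleting it means deleting every column under that prefix. The Python sketch below contrasts the two options. The function names and the representation of a range tombstone as a bare prefix are hypothetical, and timestamps are ignored entirely.

```python
def per_column_tombstones(column_names, prefix):
    """Without range tombstones: read the row first, then emit one tombstone
    per existing column whose composite name starts with the prefix."""
    return [name for name in column_names if name[:len(prefix)] == prefix]


def range_tombstone(prefix):
    """A hypothetical range tombstone: a single marker for the whole prefix."""
    return prefix


def covered(name, tombstone_prefix):
    """True if a composite column name falls under the range tombstone."""
    return name[:len(tombstone_prefix)] == tombstone_prefix


names = [("alice", "email"), ("alice", "name"), ("bob", "email")]
print(per_column_tombstones(names, ("alice",)))
print(covered(("alice", "unseen"), range_tombstone(("alice",))))
```

The per-column version needs a read before the write and only covers the columns that read happened to see; the range version is a single marker that also covers columns the deleter never read.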
Now, maybe pushing the co-location of data is not a good idea for a distributed store (it obviously raises the question of load balancing in particular), but there are cases where careful co-location is paramount to getting the best performance, so giving a good tool for that could have value.
Doing row groups 'natively' would avoid the gotchas above, but note that it has at least one drawback: if/once we do CASSANDRA-2893, isolation for row groups encoded with composite types would be a given, whereas with 'native' row groups we would have to work a bit.
So overall, I think row groups could be of interest API-wise, allowing a number of more natural models. And if we think this is indeed useful, I kind of think doing it natively could be less of a headache overall than an encoding with composites.