As of CASSANDRA-1337, we've parallelized fetches of multiple token ranges when handling range slices or secondary index queries. However, a separate command still needs to be created, issued, and handled for (almost) every token range. With vnodes enabled, the number of commands to handle becomes quite large, which introduces a lot of extra overhead. If we allow ReadCommands to contain multiple token ranges instead of a single range, we could significantly reduce that overhead.
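To make the difference in command count concrete, here is a minimal sketch in Python rather than Cassandra's Java. It is a simplified model, not the actual ReadCommand code: `TokenRange`, `RangeCommand`, `per_range_commands`, and `multi_range_commands` are hypothetical names, each range is assumed to be served by a single replica, and replication and consistency levels are ignored.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TokenRange:
    """Hypothetical stand-in for a token range (start, end]."""
    start: int
    end: int


@dataclass(frozen=True)
class RangeCommand:
    """Hypothetical stand-in for a read command sent to one replica."""
    replica: str
    ranges: Tuple[TokenRange, ...]
    limit: int


def per_range_commands(range_owners: List[Tuple[TokenRange, str]], limit: int) -> List[RangeCommand]:
    """Simplified current behaviour: one command per token range."""
    return [RangeCommand(replica, (rng,), limit) for rng, replica in range_owners]


def multi_range_commands(range_owners: List[Tuple[TokenRange, str]], limit: int) -> List[RangeCommand]:
    """Proposed behaviour: one command per replica, carrying all of its ranges."""
    grouped = defaultdict(list)
    for rng, replica in range_owners:
        grouped[replica].append(rng)
    return [RangeCommand(replica, tuple(ranges), limit) for replica, ranges in grouped.items()]


def example_ring(num_nodes: int, vnodes_per_node: int) -> List[Tuple[TokenRange, str]]:
    """Build an illustrative ring where consecutive ranges rotate across nodes."""
    total = num_nodes * vnodes_per_node
    return [(TokenRange(i, i + 1), f"node{i % num_nodes}") for i in range(total)]
```

In this model, a ring of 3 nodes with 256 vnodes each needs 768 per-range commands but only 3 multi-range commands, one per node.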
A secondary bonus of doing this is that we will reduce over-fetching of rows. Right now, each command uses the same total limit. If some of the token ranges have more rows than we expect (based on the average partition size and key estimates), the replicas will over-fetch rows, only for them to be discarded by the coordinator. In the worst-case scenario, each token range could contain LIMIT rows. With a multi-range command we could still have over-fetching, but it would be more tightly bounded: each node could return at most LIMIT rows (across all token ranges combined).
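The two bounds can be compared with a small worked example. This is an illustrative sketch in Python, not Cassandra code: the function names are hypothetical, and it assumes the worst case in which every command is actually issued and every replica returns as many rows as its command allows.

```python
from typing import Dict, List


def rows_returned_per_range(rows_by_replica: Dict[str, List[int]], limit: int) -> int:
    """One command per token range: each range is capped at `limit` on its own."""
    return sum(
        min(rows, limit)
        for range_rows in rows_by_replica.values()
        for rows in range_rows
    )


def rows_returned_multi_range(rows_by_replica: Dict[str, List[int]], limit: int) -> int:
    """One command per replica: the cap applies across all of its ranges combined."""
    return sum(min(sum(range_rows), limit) for range_rows in rows_by_replica.values())


# Worst case from the description: every token range holds at least LIMIT rows.
LIMIT = 100
rows_by_replica = {
    "node0": [100, 100, 100, 100],
    "node1": [100, 100, 100, 100],
    "node2": [100, 100, 100, 100],
}

per_range_total = rows_returned_per_range(rows_by_replica, LIMIT)
multi_range_total = rows_returned_multi_range(rows_by_replica, LIMIT)
```

With 3 nodes and 4 ranges each, the per-range commands can return up to 12 × LIMIT = 1200 rows, while the multi-range commands return at most 3 × LIMIT = 300 rows. The coordinator needs only LIMIT of them in either case.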