[CASSANDRA-10414] Add Internal Support for Reading Multiple Token Ranges with Single Command - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Normal
Resolution: Unresolved
Fix Version/s: 5.x
Component/s: Legacy/Coordination
Labels:
- performance

Description

Since CASSANDRA-1337, we've parallelized fetches of multiple token ranges when handling range slices or secondary index queries. However, a separate command still needs to be created, issued, and handled for (almost) every token range. With vnodes enabled, the number of commands to handle becomes quite large, introducing a lot of extra overhead. If we allow ReadCommands to contain multiple token ranges instead of a single range, we could significantly reduce this overhead.
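To make the difference in command count concrete, here is a rough sketch rather than Cassandra code. It assumes a hypothetical ring of 3 nodes with 256 vnodes each, every range owned by exactly one node (no replication), and no merging of adjacent ranges by the coordinator. The function names and the `range_owners` mapping are illustrative only.

```python
from collections import defaultdict


def commands_per_range(range_owners):
    """One command per token range (roughly the behaviour described above)."""
    return [(owner, [rng]) for rng, owner in range_owners.items()]


def commands_multi_range(range_owners):
    """One command per node, carrying every token range that node owns."""
    grouped = defaultdict(list)
    for rng, owner in range_owners.items():
        grouped[owner].append(rng)
    return [(owner, ranges) for owner, ranges in grouped.items()]


# Hypothetical ring: 3 nodes, 256 vnodes each, ranges assigned round-robin.
nodes = ["node-a", "node-b", "node-c"]
num_ranges = len(nodes) * 256
range_owners = {(i, i + 1): nodes[i % len(nodes)] for i in range(num_ranges)}

single = commands_per_range(range_owners)
multi = commands_multi_range(range_owners)

print(len(single))  # 768 commands, one per token range
print(len(multi))   # 3 commands, one per node
```

Under these assumptions the coordinator creates and tracks 3 commands instead of 768, while the same 768 ranges are still covered.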

A secondary bonus of doing this is that we will reduce over-fetching of rows. Right now, each command uses the same total limit. If some of the token ranges have more rows than we expect (based on the average partition size and key estimates), the replicas will over-fetch rows, only for them to be discarded by the coordinator. In the worst-case scenario, each token range could contain LIMIT rows. With a multi-range command we could still have over-fetching, but it would be more tightly bounded: each node could return at most LIMIT rows (across all token ranges combined).
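The two worst-case bounds can be compared with a little arithmetic. This is only an illustration: the figures (768 token ranges, 3 nodes, LIMIT 100) are assumed, the function names are hypothetical, and it supposes that every queried range or node is contacted and returns its maximum.

```python
def worst_case_rows_per_range(num_ranges, limit):
    """Each single-range command may return up to `limit` rows."""
    return num_ranges * limit


def worst_case_rows_multi_range(num_nodes, limit):
    """Each node returns at most `limit` rows across all of its ranges."""
    return num_nodes * limit


limit = 100        # assumed query LIMIT
num_ranges = 768   # assumed: 3 nodes x 256 vnodes
num_nodes = 3

per_range_total = worst_case_rows_per_range(num_ranges, limit)
multi_range_total = worst_case_rows_multi_range(num_nodes, limit)

# The coordinator keeps only `limit` rows; the rest are discarded.
print(per_range_total - limit)    # 76700 rows over-fetched in the worst case
print(multi_range_total - limit)  # 200 rows over-fetched in the worst case
```

The bound drops from a multiple of the number of token ranges to a multiple of the number of nodes, which is the tighter bound described above.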