Details
Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Fixed
Description
The range read algorithm relies on the Index API's notion of estimated result rows to decide how many replicas to contact in parallel during its first round of requests. The more results expected from a replica for a token range, the fewer replicas the range read will initially contact. Like SASI, SAI returns a huge negative number as that estimate to make sure it is selected over other indexes, and this floors the concurrency factor at 1. The actual formula looks like this:
// resultsPerRange, from SAI, is a giant negative number
concurrencyFactor = Math.max(1, Math.min(ranges.rangeCount(),
                                         (int) Math.ceil(command.limits().count() / resultsPerRange)));
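To see why this always collapses to 1, here is a minimal, self-contained Java sketch of the arithmetic. The variable names and the Long.MIN_VALUE estimate are stand-ins for the real coordinator state, not the actual code:

// Illustrative arithmetic only; names and values are stand-ins, not coordinator fields.
public class ConcurrencyFactorDemo
{
    public static void main(String[] args)
    {
        float resultsPerRange = Long.MIN_VALUE; // SAI/SASI-style "always win" estimate (assumption)
        int limit = 10;                         // command.limits().count()
        int rangeCount = 3;                     // ranges.rangeCount()

        // 10 / -9.2e18 is a tiny negative value; ceil() gives -0.0, the int cast gives 0,
        // min(3, 0) keeps 0, and max(1, 0) floors the result at 1.
        int concurrencyFactor = Math.max(1, Math.min(rangeCount,
                (int) Math.ceil(limit / resultsPerRange)));

        System.out.println(concurrencyFactor); // always prints 1
    }
}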
Although the concurrency factor is updated as actual results stream in, always starting with a single range request to a single replica is not ideal for SAI. For example, assume a 3 node cluster without vnodes and a keyspace at RF=1, with 10 rows spread across the 3 nodes. Issuing a query that matches all 10 rows with a LIMIT of 10 will make 2 or 3 serial range requests from the coordinator, one to each of the 3 nodes.
This can be fixed by allowing indexes to bypass the initial concurrency calculation, letting SAI queries contact the entire ring in a single round of requests, or at worst the minimum number of rounds permitted by the existing hard cap on ranges per round. A sketch of one possible shape for that bypass follows.
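The sketch below is hypothetical: the method name bypassesInitialConcurrencyFactor, the trimmed Index interface, and the maxRangesPerRound parameter are illustrative assumptions, not the committed API. It only shows how an index-level opt-out could change the starting factor.

// Hypothetical sketch of the proposed bypass; names are assumptions, not the actual API.
public class ConcurrencySketch
{
    interface Index
    {
        // Hypothetical opt-out: true if the index wants the first round to fan out fully.
        default boolean bypassesInitialConcurrencyFactor() { return false; }
    }

    static int initialConcurrencyFactor(Index index, int rangeCount, int limit,
                                        float resultsPerRange, int maxRangesPerRound)
    {
        if (index != null && index.bypassesInitialConcurrencyFactor())
            return Math.min(rangeCount, maxRangesPerRound); // whole ring, or the per-round cap

        // existing estimate-driven behaviour, which floors at 1 for a negative estimate
        return Math.max(1, Math.min(rangeCount,
                (int) Math.ceil(limit / resultsPerRange)));
    }

    public static void main(String[] args)
    {
        Index sai = new Index() { public boolean bypassesInitialConcurrencyFactor() { return true; } };
        System.out.println(initialConcurrencyFactor(sai, 3, 10, Long.MIN_VALUE, 32));  // 3
        System.out.println(initialConcurrencyFactor(null, 3, 10, Long.MIN_VALUE, 32)); // 1
    }
}

With an opt-out like this, the 3 node example above would contact all 3 nodes in a single round rather than issuing serial single-replica requests.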
Attachments
Issue Links