CASSANDRA-18515

Optimize Initial Concurrency Selection for Range Read Algorithm During SAI Queries


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • 5.x
    • Component/s: Feature/2i Index
    • None

    Description

      The range read algorithm relies on the Index API’s notion of estimated result rows to decide how many replicas to contact in parallel during its first round of requests. The more results it expects from a replica for a token range, the fewer replicas the range read initially tries to contact. Like SASI, SAI floors that estimate to a huge negative number to make sure it is selected over other indexes, and as a side effect this clamps the initial concurrency factor to 1. The actual formula looks like this:

      // resultsPerRange, from SAI, is a giant negative number
      concurrencyFactor = Math.max(1, Math.min(ranges.rangeCount(), (int) Math.ceil(command.limits().count() / resultsPerRange)));
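
      To make the clamping concrete, here is a minimal, self-contained sketch of the formula above. The class name, the method name, and the value -1.0e18f (a stand-in for the "giant negative number") are assumptions for illustration, not Cassandra code.

```java
class InitialConcurrencyDemo {
    // Same arithmetic as the quoted formula; parameter names are placeholders
    // for ranges.rangeCount(), command.limits().count() and resultsPerRange.
    static int initialConcurrency(int rangeCount, int limit, float resultsPerRange) {
        return Math.max(1, Math.min(rangeCount, (int) Math.ceil(limit / resultsPerRange)));
    }

    public static void main(String[] args) {
        // Assumed stand-in for the huge negative estimate reported by SAI.
        float saiEstimate = -1.0e18f;

        // 10 / -1.0e18 is a tiny negative number; ceil gives -0.0, the int cast gives 0,
        // and the outer Math.max lifts the result to 1.
        System.out.println(initialConcurrency(3, 10, saiEstimate)); // prints 1

        // With a positive estimate of 5 rows per range, ceil(10 / 5) = 2 ranges are requested.
        System.out.println(initialConcurrency(3, 10, 5.0f)); // prints 2
    }
}
```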
      

      Although that concurrency factor is updated as actual results stream in, always starting with a single range request to a single replica is not ideal for SAI. For example, assume a 3-node cluster without vnodes and a keyspace at RF=1, with 10 rows spread across the 3 nodes. A query that matches all 10 rows with a LIMIT of 10 will make 2 or 3 serial range requests from the coordinator, each to a different node, rather than contacting all 3 nodes at once.

      This can be fixed by allowing indexes to bypass the initial concurrency calculation, so that SAI queries can contact the entire ring in a single round of queries or, at worst, in the minimum number of rounds permitted by the existing maximum number of ranges per round.
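
      One way to picture the proposed bypass is sketched below. Everything here is hypothetical: the interface QueryPlanSketch, the method shouldEstimateInitialConcurrency, the plan helper, and the cap of 32 ranges per round are names and values invented for illustration and are not confirmed by this ticket. The round count assumes the concurrency factor stays fixed, whereas the real algorithm adjusts it as results arrive.

```java
class ConcurrencyBypassSketch {
    // Hypothetical hook: an index returns false to skip the estimate-based calculation.
    interface QueryPlanSketch {
        float estimatedResultsPerRange();

        default boolean shouldEstimateInitialConcurrency() {
            return true;
        }
    }

    // Hypothetical helper that builds a plan with a fixed estimate and bypass flag.
    static QueryPlanSketch plan(float estimate, boolean estimateConcurrency) {
        return new QueryPlanSketch() {
            @Override
            public float estimatedResultsPerRange() {
                return estimate;
            }

            @Override
            public boolean shouldEstimateInitialConcurrency() {
                return estimateConcurrency;
            }
        };
    }

    static int initialConcurrency(QueryPlanSketch plan, int rangeCount, int limit, int maxRangesPerRound) {
        if (!plan.shouldEstimateInitialConcurrency()) {
            // Bypass: contact as many ranges as the per-round cap allows.
            return Math.max(1, Math.min(rangeCount, maxRangesPerRound));
        }
        // Otherwise fall back to the formula quoted in the description.
        int estimated = (int) Math.ceil(limit / plan.estimatedResultsPerRange());
        return Math.max(1, Math.min(rangeCount, estimated));
    }

    // Rounds needed if the concurrency factor never changed (a simplification).
    static int roundsAtFixedConcurrency(int rangeCount, int concurrency) {
        return (rangeCount + concurrency - 1) / concurrency;
    }

    public static void main(String[] args) {
        int cap = 32; // assumed per-round maximum, for illustration only
        int withEstimate = initialConcurrency(plan(-1.0e18f, true), 3, 10, cap);
        int withBypass = initialConcurrency(plan(-1.0e18f, false), 3, 10, cap);

        System.out.println(withEstimate + " -> " + roundsAtFixedConcurrency(3, withEstimate) + " rounds");
        System.out.println(withBypass + " -> " + roundsAtFixedConcurrency(3, withBypass) + " rounds");
    }
}
```

      In this sketch the 3-node example from the description drops from three fixed-concurrency rounds to one, and a ring with more ranges than the cap needs only as many rounds as the cap dictates.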

            People

              Assignee: Mike Adamson (mike_tr_adamson)
              Reporter: Mike Adamson (mike_tr_adamson)
              Authors: Mike Adamson
              Reviewers: Andres de la Peña, Berenguer Blasi, Caleb Rackliffe

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 10m