Cassandra / CASSANDRA-10414

Add Internal Support for Reading Multiple Token Ranges with Single Command


Details

    Description

      Since CASSANDRA-1337, we've parallelized fetches of multiple token ranges when handling range slices or secondary index queries. However, a separate command still needs to be created, issued, and handled for (almost) every token range. With vnodes enabled, the number of commands to handle becomes quite large, introducing a lot of extra overhead. If ReadCommands could contain multiple token ranges instead of a single range, we could significantly reduce this overhead.
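
      To make the difference in command count concrete, here is a minimal sketch in Python. It is not Cassandra code: RangeCommand, single_range_commands, multi_range_commands, and the toy range-to-replica assignment are all hypothetical, and it simplifies "(almost) every token range" to exactly one command per range.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Tuple

TokenRange = Tuple[int, int]  # (start_token, end_token), illustrative only


@dataclass(frozen=True)
class RangeCommand:
    """Hypothetical read command; not Cassandra's actual ReadCommand class."""
    replica: str
    ranges: Tuple[TokenRange, ...]
    limit: int


def single_range_commands(assignments, limit) -> List[RangeCommand]:
    """Simplified current behaviour: one command per token range."""
    return [RangeCommand(replica, (rng,), limit) for rng, replica in assignments]


def multi_range_commands(assignments, limit) -> List[RangeCommand]:
    """Proposed behaviour: one command per replica, carrying all of its ranges."""
    grouped = defaultdict(list)
    for rng, replica in assignments:
        grouped[replica].append(rng)
    return [RangeCommand(replica, tuple(rngs), limit)
            for replica, rngs in grouped.items()]


# Toy cluster (assumption): 3 nodes, 4 vnode ranges each.
assignments = [((i * 100, (i + 1) * 100), f"node{i % 3}") for i in range(12)]

before = single_range_commands(assignments, limit=50)
after = multi_range_commands(assignments, limit=50)
print(len(before), len(after))  # 12 3
```

      In this toy setup the command count drops from the number of ranges to the number of replicas, so the saving would grow with the number of vnodes per node.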

      A secondary bonus of doing this is that we will reduce over-fetching of rows. Right now, each command uses the same total limit. If some of the token ranges have more rows than we expect (based on the average partition size and key estimates), the replicas will over-fetch rows, only for them to be discarded by the coordinator. In the worst-case scenario, each token range could contain LIMIT rows. With a multi-range command we could still over-fetch, but the excess would be more tightly bounded: each node could return at most LIMIT rows (across all token ranges combined).
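
      The bound can be illustrated with a small worked example. This is a sketch under assumptions, not a model of the real read path: the function names and the per-range row counts are made up, and each command is assumed to stop as soon as it has collected LIMIT rows.

```python
from typing import List


def rows_returned_per_range(rows_in_ranges: List[int], limit: int) -> int:
    """One command per range: every range may return up to `limit` rows."""
    return sum(min(rows, limit) for rows in rows_in_ranges)


def rows_returned_multi_range(rows_in_ranges: List[int], limit: int) -> int:
    """One command for all ranges: `limit` is shared across the ranges."""
    remaining = limit
    returned = 0
    for rows in rows_in_ranges:
        take = min(rows, remaining)
        returned += take
        remaining -= take
        if remaining == 0:
            break  # the shared limit is exhausted, skip the remaining ranges
    return returned


# Hypothetical row counts for four token ranges owned by one node.
rows_in_ranges = [500, 20, 300, 0]
limit = 100

per_range = rows_returned_per_range(rows_in_ranges, limit)
combined = rows_returned_multi_range(rows_in_ranges, limit)
print(per_range, combined)  # 220 100
```

      With per-range commands the node returns 220 rows for a query that needs at most 100; with a single multi-range command it returns exactly LIMIT rows, which matches the "at most LIMIT rows per node" bound described above.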


          People

            Unassigned
            thobbs Tom Hobbs

