  1. Cassandra
  2. CASSANDRA-4650

RangeStreamer should be smarter when picking endpoints for streaming when N >= 3 in each DC.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 4.0
    • Component/s: None
    • Labels:

      Description

      The getRangeFetchMap method in RangeStreamer should pick distinct source nodes to stream data from when the number of replicas (N) in each DC is three or more.
      When N >= 3 in a DC, there are at least two other replicas to stream each range from. Consider an example of 4 nodes in one datacenter with a replication factor of 3.
      If a node goes down, it needs to recover 3 ranges of data. With the current code, only two source nodes could get selected, because the candidates for each range are ordered by proximity and the same nearby node can be chosen for more than one range.
      Ideally we want to select 3 different nodes for streaming the data. We can do this by selecting a unique node for each range.

      Advantages:
      Streaming from more source nodes should speed up bootstrapping a node and put less pressure on each node serving the data.

      Note: This change has no effect when N < 3 in each DC, because data is then streamed from only 2 nodes.
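
      The selection described above can be sketched roughly as follows. This is a minimal greedy illustration, not the attached patch and not the real RangeStreamer code: the class `UniqueSourceSketch`, the method `pickSources`, and the string names for ranges and nodes are all hypothetical. It assumes each range comes with a candidate list already sorted by proximity, and it picks the candidate that has been assigned the fewest ranges so far.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: spread ranges over distinct source nodes where possible.
class UniqueSourceSketch {

    /**
     * Picks one source node per range.
     * candidatesByRange maps a range name to its live replicas, closest first.
     */
    static Map<String, String> pickSources(Map<String, List<String>> candidatesByRange) {
        Map<String, Integer> load = new HashMap<>();        // ranges assigned per node
        Map<String, String> chosen = new LinkedHashMap<>(); // range -> selected source

        for (Map.Entry<String, List<String>> e : candidatesByRange.entrySet()) {
            String best = null;
            for (String node : e.getValue()) {
                // Strict '<' keeps the closer node when two candidates are equally loaded.
                if (best == null || load.getOrDefault(node, 0) < load.getOrDefault(best, 0)) {
                    best = node;
                }
            }
            if (best == null) {
                throw new IllegalStateException("no source available for range " + e.getKey());
            }
            chosen.put(e.getKey(), best);
            load.merge(best, 1, Integer::sum);
        }
        return chosen;
    }

    public static void main(String[] args) {
        // 4 nodes (A, B, C, D), replication factor 3; A must recover 3 ranges.
        // Assumed proximity order from A: B, then C, then D.
        Map<String, List<String>> candidates = new LinkedHashMap<>();
        candidates.put("r1", Arrays.asList("B", "C"));
        candidates.put("r2", Arrays.asList("B", "D"));
        candidates.put("r3", Arrays.asList("C", "D"));

        // Picking the closest node per range would give B, B, C (two sources).
        System.out.println(pickSources(candidates)); // prints {r1=B, r2=D, r3=C}
    }
}
```

      A greedy pass like this is easy to follow but is not guaranteed to find the most even assignment for every topology; it is only meant to show the idea of preferring a source that is not already in use.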

        Attachments

        1. photo-1.JPG
          1.76 MB
          sankalp kohli
        2. CASSANDRA-4650_trunk.txt
          32 kB
          sankalp kohli

              People

              • Assignee:
                kohlisankalp sankalp kohli
                Reporter:
                kohlisankalp sankalp kohli
                Reviewer:
                Marcus Eriksson
              • Votes:
                0
                Watchers:
                8

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  24h
                  Remaining:
                  24h
                  Logged:
                  Not Specified