CASSANDRA-12510

Disallow decommission when number of replicas will drop below configured RF


Details

    Description

      Steps to reproduce:

      • Create a 3-node cluster in DC1, then create a keyspace test_keyspace containing a table test_table, using NetworkTopologyStrategy with DC1=3. Populate the table with some data.
      • Add 5 more nodes to the cluster, but in DC2. Do not alter the keyspace to add DC2 to its replication settings (this is intentional, and it is why the bug shows up). DESC KEYSPACE should still list NetworkTopologyStrategy with DC1=3 as the RF.
      • As expected, this is now an 8-node cluster with 3 nodes in DC1 and 5 in DC2.
      • Now decommission the nodes in DC1. The decommission runs fine on all 3 nodes, but because the new nodes are in DC2 and the keyspace's replication is restricted to DC1, the 5 new nodes do not receive any data.
      • You end up with a 5-node cluster that has none of the data from the 3 decommissioned nodes, which means data loss.

      I understand that this problem could have been avoided by running an ALTER KEYSPACE statement to add DC2 replication before adding the 5 nodes. But the fact that decommission ran fine on the 3 nodes in DC1, without complaining that there were no nodes to stream their data to, is a little discomforting.
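      The guard requested in the title can be illustrated with a small model. This is only a sketch of the idea, not the logic of the attached patches or of Cassandra itself; the function name decommission_violations and the dictionary layout are hypothetical, chosen for this example. It assumes a keyspace is at risk when the datacenter losing a node would be left with fewer nodes than the RF configured for that datacenter.

```python
from typing import Dict, List


def decommission_violations(
    nodes_per_dc: Dict[str, int],
    keyspaces: Dict[str, Dict[str, int]],
    leaving_dc: str,
) -> List[str]:
    """Return keyspaces whose RF in leaving_dc cannot be met once one node leaves.

    Hypothetical helper: nodes_per_dc maps datacenter -> live node count,
    keyspaces maps keyspace name -> {datacenter: replication factor}.
    """
    # Node count in the affected datacenter after the decommission completes.
    remaining = nodes_per_dc.get(leaving_dc, 0) - 1
    return [
        name
        for name, rf_by_dc in keyspaces.items()
        # A datacenter absent from the replication settings holds no replicas (RF 0).
        if rf_by_dc.get(leaving_dc, 0) > remaining
    ]


# Scenario from the report: 3 nodes in DC1, 5 in DC2, keyspace replicated only to DC1.
nodes = {"DC1": 3, "DC2": 5}
keyspaces = {"test_keyspace": {"DC1": 3}}

print(decommission_violations(nodes, keyspaces, "DC1"))  # keyspaces that would block the decommission
print(decommission_violations(nodes, keyspaces, "DC2"))  # DC2 holds no replicas of this keyspace
```

      Under this model, the first decommission in DC1 would already be refused, because DC1 would drop to 2 nodes while test_keyspace still asks for 3 replicas there. Removing a DC2 node would be allowed, since the keyspace places no replicas in DC2.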

      Attachments

        1. 12510-3.x.patch
          7 kB
          Kurt Greaves
        2. 12510-3.x-v2.patch
          8 kB
          Kurt Greaves

        Activity

          People

            KurtG Kurt Greaves
            asood Atin Sood
            Kurt Greaves
            Paulo Motta
