  1. Cassandra
  2. CASSANDRA-16545

Cluster topology change may produce false unavailable for queries

Details

    Description

      When the coordinator processes a query, it first gets the ReplicationStrategy (RS) from the keyspace to decide which peers to contact. It then gets the RS a second time to perform the liveness check for the requested consistency level (CL).

      The RS is a volatile field in Keyspace, so those two getter calls can return different RS values if the cluster topology changes in between, e.g. when a node is added.

      In such a scenario, the check in the second step can throw an unexpected unavailable exception, even though, from the perspective of the query, the cluster can satisfy the CL.
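
      The failure mode can be illustrated with a minimal, single-threaded Java sketch. Everything in it is a hypothetical stand-in, not Cassandra's actual code: the simplified ReplicationStrategy and Keyspace classes, the coordinate method, the node names, and the contrived replica sets before and after the change are all assumptions made for illustration. The Runnable stands in for a topology change that lands between the two reads of the volatile field.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class InconsistentViewSketch {
    // Hypothetical stand-in for a replication strategy: only the replicas of one token.
    static final class ReplicationStrategy {
        final List<String> replicas;
        ReplicationStrategy(List<String> replicas) { this.replicas = replicas; }
        int quorum() { return replicas.size() / 2 + 1; }
    }

    // Hypothetical stand-in for Keyspace: the strategy is a volatile field
    // that a topology change may swap at any time.
    static final class Keyspace {
        private volatile ReplicationStrategy rs;
        Keyspace(ReplicationStrategy rs) { this.rs = rs; }
        ReplicationStrategy getReplicationStrategy() { return rs; }
        void setReplicationStrategy(ReplicationStrategy updated) { rs = updated; }
    }

    // Returns true if the QUORUM liveness check passes for the selected peers.
    static boolean coordinate(Keyspace ks, Set<String> live, Runnable betweenReads) {
        // Step 1: first read of the volatile field, used for peer selection.
        List<String> peers = ks.getReplicationStrategy().replicas.stream()
                .filter(live::contains)
                .collect(Collectors.toList());

        betweenReads.run(); // stands in for a concurrent topology change

        // Step 2: second read of the volatile field, used for the liveness check.
        ReplicationStrategy second = ks.getReplicationStrategy();
        long liveReplicas = peers.stream().filter(second.replicas::contains).count();
        return liveReplicas >= second.quorum();
    }

    public static void main(String[] args) {
        Set<String> live = Set.of("n1", "n2", "n3");
        ReplicationStrategy after = new ReplicationStrategy(List.of("n3", "n4", "n5"));
        Keyspace ks = new Keyspace(new ReplicationStrategy(List.of("n1", "n2", "n3")));

        // No topology change between the two reads: the check passes.
        System.out.println(coordinate(ks, live, () -> {})); // prints true

        // The strategy is swapped between the reads: the peers chosen from the
        // old view are checked against the new one, and the check fails.
        System.out.println(coordinate(ks, live, () -> ks.setReplicationStrategy(after))); // prints false
    }
}
```

      In the second call, three live peers were selected under the old view, which is enough for QUORUM, yet the check reports a failure because only one of them is a replica under the new view.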

      We should use a consistent view of the RS during peer selection and the CL liveness check. In other words, both steps should reference the same RS object. This is also clearer and easier for clients to reason about: such queries are treated as having been made before the topology change.
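
      One way to picture the proposed approach is to read the volatile field once and pass that snapshot to both steps. The Java sketch below is illustrative only and is not the actual patch: the simplified ReplicationStrategy and Keyspace classes and the selectPeers, isSufficientlyLive, and coordinate methods are hypothetical names, and the replica sets are contrived.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ConsistentViewSketch {
    // Hypothetical stand-in for a replication strategy: only the replicas of one token.
    static final class ReplicationStrategy {
        final List<String> replicas;
        ReplicationStrategy(List<String> replicas) { this.replicas = replicas; }
        int quorum() { return replicas.size() / 2 + 1; }
    }

    // Hypothetical stand-in for Keyspace with a volatile strategy field.
    static final class Keyspace {
        private volatile ReplicationStrategy rs;
        Keyspace(ReplicationStrategy rs) { this.rs = rs; }
        ReplicationStrategy getReplicationStrategy() { return rs; }
        void setReplicationStrategy(ReplicationStrategy updated) { rs = updated; }
    }

    // Both steps take the strategy as a parameter instead of re-reading the field.
    static List<String> selectPeers(ReplicationStrategy rs, Set<String> live) {
        return rs.replicas.stream().filter(live::contains).collect(Collectors.toList());
    }

    static boolean isSufficientlyLive(ReplicationStrategy rs, List<String> peers) {
        return peers.stream().filter(rs.replicas::contains).count() >= rs.quorum();
    }

    static boolean coordinate(Keyspace ks, Set<String> live, Runnable betweenSteps) {
        // Single read of the volatile field: the snapshot used by the whole query.
        ReplicationStrategy rs = ks.getReplicationStrategy();
        List<String> peers = selectPeers(rs, live);

        betweenSteps.run(); // a topology change here no longer affects this query

        return isSufficientlyLive(rs, peers);
    }

    public static void main(String[] args) {
        Set<String> live = Set.of("n1", "n2", "n3");
        ReplicationStrategy after = new ReplicationStrategy(List.of("n3", "n4", "n5"));
        Keyspace ks = new Keyspace(new ReplicationStrategy(List.of("n1", "n2", "n3")));

        // The strategy is swapped mid-query, but the query keeps its snapshot.
        System.out.println(coordinate(ks, live, () -> ks.setReplicationStrategy(after))); // prints true

        // Later queries observe the new strategy.
        System.out.println(ks.getReplicationStrategy() == after); // prints true
    }
}
```

      Passing the snapshot explicitly makes the query behave as if it had been issued entirely before the topology change, while any query that starts afterwards picks up the new RS.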

          People

            yifanc Yifan Cai
            yifanc Yifan Cai
            Yifan Cai
            Aleksey Yeschenko, Andres de la Peña

              Time Tracking

                Estimated:
                Not Specified
                Remaining:
                0h
                Logged:
                1h 10m