  1. Cassandra
  2. CASSANDRA-16525

Gossip STATUS can be either missing during upgrade or stale after upgrade

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 4.0-rc1, 4.0
    • Cluster/Gossip
    • None

    Description

      In 4.0, new application states are added in Gossip and the corresponding old ones are deprecated, e.g. STATUS and the successor STATUS_WITH_PORT.

      The jvm (upgrade) dtest uncovered two issues. First, on a node running the lower version (e.g. 3.0), the STATUS field of a peer can be missing. Second, after the cluster is fully upgraded, STATUS can coexist with the new state STATUS_WITH_PORT on the 4.0 nodes, and the STATUS field can become stale because a 4.0 node filters it out when applying new state.

      The first issue can happen in the following scenario. During an upgrade, node1 and node2 run v4 while node3 is still on v3. If node3 learns about node2 only through node1, the STATUS field of node2 will be missing from node3's local state, which is unexpected. There are many reasons why node3 might not exchange gossip with node2 directly, e.g. a network issue between node2 and node3, or node2 simply not selecting node3 when initiating a gossip round. Gossip should be resilient to this. I have a jvm upgrade dtest that demonstrates the unexpected behavior.
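
      The relay scenario above can be sketched with a toy model. This is not Cassandra's Gossiper: the class and method names are hypothetical, the "NORMAL" values are placeholders, and it assumes that a v4 node drops the legacy STATUS when it arrives together with STATUS_WITH_PORT, while a v3 node only keeps the legacy state.

```java
import java.util.EnumMap;
import java.util.Map;

/** Toy model only, not Cassandra's Gossiper. Every name here is hypothetical. */
final class MissingStatusSketch {
    enum State { STATUS, STATUS_WITH_PORT }

    /** What node2 (v4) publishes about itself: the legacy state and its successor. */
    static Map<State, String> node2State() {
        Map<State, String> states = new EnumMap<>(State.class);
        states.put(State.STATUS, "NORMAL");           // placeholder value
        states.put(State.STATUS_WITH_PORT, "NORMAL"); // placeholder value
        return states;
    }

    /** Assumed v4 behavior: drop the legacy state when its successor arrives with it. */
    static Map<State, String> storeOnV4(Map<State, String> incoming) {
        Map<State, String> stored = new EnumMap<>(State.class);
        stored.putAll(incoming);
        if (stored.containsKey(State.STATUS_WITH_PORT)) {
            stored.remove(State.STATUS);
        }
        return stored;
    }

    /** Assumed v3 behavior: only the legacy state is understood and kept. */
    static Map<State, String> storeOnV3(Map<State, String> incoming) {
        Map<State, String> stored = new EnumMap<>(State.class);
        if (incoming.containsKey(State.STATUS)) {
            stored.put(State.STATUS, incoming.get(State.STATUS));
        }
        return stored;
    }

    public static void main(String[] args) {
        // node3 talks to node2 directly versus hearing about node2 via node1.
        Map<State, String> direct = storeOnV3(node2State());
        Map<State, String> relayed = storeOnV3(storeOnV4(node2State()));
        System.out.println("direct from node2: " + direct);
        System.out.println("relayed by node1:  " + relayed);
    }
}
```

      In this model node3 keeps STATUS when it hears from node2 directly, but ends up with no STATUS entry when node1 relays its already-filtered view.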

      The cause of the second issue is more subtle. The heartbeat update happens as part of the Gossip task, outside of the GossipStage. When node2 has just updated its local application state and receives a SYN from node1, node2 replies with its gossip state without having updated the heartbeat version. When node1 receives the reply, it first filters out the legacy STATUS field and saves only the new one. So far so good. However, node2 soon updates its heartbeat, and in the next gossip round node1 realizes that its local version is lower than the remote (node2) version. So node2 sends STATUS along to node1. Because STATUS does not arrive together with the new field, node1 does not filter it out on receipt. Boo! Node1 now has the STATUS field from node2. Such a field can become stale and diverge from its successor in a live cluster. The jvm upgrade test testStatusFieldShouldExistInOldVersionNodes reproduces it fairly easily once the entire cluster is upgraded. There is also another jvm dtest (with source changes to make the result deterministic, see the attached Demonstrate-a-scenario-that-a-node-may-hold-the-stale-status.patch) that demonstrates that STATUS can be replicated to a peer and become stale.
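
      The sequence above can be replayed in a small model of node1's view of node2. It is a sketch under stated assumptions, not the actual Gossiper code: the class, its methods, and the "NORMAL"/"LEAVING" values are illustrative, and the filter is assumed to drop STATUS only when STATUS_WITH_PORT is in the same batch.

```java
import java.util.EnumMap;
import java.util.Map;

/** Toy model of node1's view of node2. Hypothetical names, not Cassandra's API. */
final class StaleStatusSketch {
    enum State { STATUS, STATUS_WITH_PORT }

    private final Map<State, String> stored = new EnumMap<>(State.class);

    /** Builds one gossip batch; a null argument means "state not included". */
    static Map<State, String> batch(String legacy, String successor) {
        Map<State, String> states = new EnumMap<>(State.class);
        if (legacy != null) states.put(State.STATUS, legacy);
        if (successor != null) states.put(State.STATUS_WITH_PORT, successor);
        return states;
    }

    /** Assumed filter: STATUS is skipped only when its successor is in the same batch. */
    void apply(Map<State, String> incoming) {
        boolean hasSuccessor = incoming.containsKey(State.STATUS_WITH_PORT);
        for (Map.Entry<State, String> entry : incoming.entrySet()) {
            if (entry.getKey() == State.STATUS && hasSuccessor) {
                continue; // legacy state filtered out
            }
            stored.put(entry.getKey(), entry.getValue());
        }
    }

    String get(State state) {
        return stored.get(state);
    }

    public static void main(String[] args) {
        StaleStatusSketch node1View = new StaleStatusSketch();
        node1View.apply(batch("NORMAL", "NORMAL"));   // both arrive together: STATUS filtered
        node1View.apply(batch("NORMAL", null));       // STATUS arrives alone: it slips through
        node1View.apply(batch("LEAVING", "LEAVING")); // later change: STATUS filtered again
        System.out.println("STATUS           = " + node1View.get(State.STATUS));
        System.out.println("STATUS_WITH_PORT = " + node1View.get(State.STATUS_WITH_PORT));
    }
}
```

      After the third batch the stored STATUS still holds the value from the second batch while STATUS_WITH_PORT has moved on, which is the divergence described above.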

      The fix is to:
      1) retain the legacy fields while the cluster is still in mixed mode
      2) remove the legacy fields once the cluster is fully upgraded
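
      The two rules of the fix could look roughly like the following. This is only an illustration of the idea, not the committed patch: the class, the apply method, and the mixedMode flag are hypothetical, and how the real code detects a mixed-version cluster is not modeled here.

```java
import java.util.EnumMap;
import java.util.Map;

/** Illustration of the two rules only. Hypothetical names, not the actual patch. */
final class LegacyStateFixSketch {
    enum State { STATUS, STATUS_WITH_PORT }

    /**
     * Applies an incoming batch to the locally stored states of a peer.
     * mixedMode is an assumed flag meaning "some node is still on the old version".
     */
    static void apply(Map<State, String> stored, Map<State, String> incoming, boolean mixedMode) {
        // Rule 1: in mixed mode nothing is filtered, so old nodes can still learn STATUS.
        stored.putAll(incoming);
        // Rule 2: once fully upgraded, the legacy state is removed however it arrived.
        if (!mixedMode) {
            stored.remove(State.STATUS);
        }
    }

    public static void main(String[] args) {
        Map<State, String> stored = new EnumMap<>(State.class);
        Map<State, String> incoming = new EnumMap<>(State.class);
        incoming.put(State.STATUS, "NORMAL");           // placeholder value
        incoming.put(State.STATUS_WITH_PORT, "NORMAL"); // placeholder value

        apply(stored, incoming, true);
        System.out.println("mixed mode:     " + stored);
        apply(stored, incoming, false);
        System.out.println("fully upgraded: " + stored);
    }
}
```

      Removing the legacy entry from the stored map, instead of only from the incoming batch, also clears a STATUS that slipped in during an earlier round.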

            People

              yifanc Yifan Cai
              yifanc Yifan Cai
              Yifan Cai
              Michael Semb Wever
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3h