In CASSANDRA-10254, the paging states generated by 3.0 for the native protocol v4 were made 3.0-specific. This was done because the paging state in pre-3.0 versions contains a serialized cell name, but 3.0 doesn't talk in terms of cells internally (at least not the pre-3.0 ones), so using an old-format cell name when we only have 3.0 nodes is inefficient and inelegant.
Unfortunately, that change was made on the assumption that protocol v4 was 3.0-only, but it's not: it ended up being released with 2.2, and that completely slipped my mind. So in practice, you can't properly have a mixed 2.2/3.0 cluster if your driver is using protocol v4, since a v4 paging state generated by a node on one version may be handed back to a node on the other version, which expects a different format.
And unfortunately, I don't think there is an easy way to fix that without breaking something. Concretely, I can see 3 choices:
- we change 3.0 so that it generates old-format paging states on the v4 protocol. The 2 main downsides are that 1) this breaks minor upgrades within 3.0 if the driver is using the v4 protocol, and at least on the Java side the only driver versions that support 3.0 will use v4 by default, and 2) we're signing off on having sub-optimal paging states until protocol v5 ships (probably not too soon).
- we remove the v4 protocol from 2.2. This means 2.2 users will have to switch to v3 before upgrading, or risk breaking the upgrade. This is also bad, but I'm not sure the driver versions using the v4 protocol are quite ready yet (at least the Java driver is not GA yet), so if we work with the driver teams to make sure the v3 protocol gets preferred by default on 2.2 in the GA versions of these drivers, this might be somewhat transparent to users.
- we don't change anything code-wise, but we document clearly that you can't upgrade from 2.2 to 3.0 if your clients use protocol v4 (so we leave upgrade broken if the v4 protocol is used, as it is currently). This is not great, but we can work with the driver teams here again to make sure drivers prefer the v3 version for 2.2 nodes, so most people don't notice in practice.
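To make the "drivers prefer v3 for 2.2 nodes" idea from the last two options concrete, here is a minimal sketch of what such a driver-side rule could look like. `ProtocolPreference` and `preferredVersion` are hypothetical names for illustration only, not part of any real driver API, and the sketch assumes the driver already knows the release version of the node it is contacting.

```java
/** Hypothetical driver-side helper; not part of any real driver API. */
final class ProtocolPreference {

    /**
     * Caps the native protocol version at v3 when the contacted node reports
     * a 2.2.x release; otherwise returns the highest version the driver supports.
     */
    static int preferredVersion(int nodeMajor, int nodeMinor, int driverMax) {
        boolean is22 = nodeMajor == 2 && nodeMinor == 2;
        return is22 ? Math.min(driverMax, 3) : driverMax;
    }
}
```

With this rule, `preferredVersion(2, 2, 4)` picks v3 while `preferredVersion(3, 0, 4)` still picks v4, so a client that starts against a 2.2 cluster never produces v4 paging states during the upgrade (assuming it keeps the version it negotiated at startup).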
I think I'm leaning towards solution 3). It's not great, but at least we break no minor upgrades (neither on 2.2 nor on 3.0), which is probably the most important thing. We'd basically just be adding a new condition on 2.2->3.0 upgrades. We could additionally make 3.0 nodes completely refuse v4 connections if they know a 2.2 node is in the cluster, for extra safety.
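As a rough illustration of that extra safety check, the sketch below shows the decision a 3.0 node could make when a client asks for a given protocol version. `ProtocolGate` and its methods are hypothetical names, and the list of peer release versions is assumed to be something the node already has at hand; this is not how the actual server code is structured.

```java
import java.util.List;

/** Hypothetical sketch of the server-side check; none of these names are Cassandra APIs. */
final class ProtocolGate {
    static final int V4 = 4;

    /** True for release strings such as "2.2", "2.2.1" or "2.2.0-rc1". */
    static boolean is22(String releaseVersion) {
        String[] parts = releaseVersion.split("[.-]");
        return parts.length >= 2 && parts[0].equals("2") && parts[1].equals("2");
    }

    /** True if at least one known peer is still running a 2.2 release. */
    static boolean has22Peer(List<String> peerReleaseVersions) {
        for (String version : peerReleaseVersions)
            if (is22(version))
                return true;
        return false;
    }

    /** v3 and below are always accepted; v4 only once no 2.2 peer remains. */
    static boolean acceptConnection(int requestedProtocol, List<String> peerReleaseVersions) {
        return requestedProtocol < V4 || !has22Peer(peerReleaseVersions);
    }
}
```

The check only refuses v4, so clients on v3 keep working throughout the upgrade, and v4 becomes available again as soon as the last 2.2 node is gone.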