Other things being equal, I would prefer to provide a thrift interface to add/rename/remove keyspaces and CFs through a single coordinator node (vs having to update each node via JMX, or push out a new config file). Keeping things config-file based has three drawbacks:
- it requires filesystem access for whoever is doing the update, which is problematic in some environments
- it makes life difficult for systems building on top of cassandra that want to automate this (easy for a human to dsh scp from somewhere; possible, but painful, to integrate this into an automated system that is more than a one-off)
- it requires either all nodes being up for the upgrade, which is simple but unrealistic, or ops manually re-pushing the update to nodes that are down, which is a pita
So if we can instead move to a system where KS/CF definitions are stored in a system CF and updated programmatically, I think that would be best.
Possible evolution of the code might look like
(1) move KS/CF definitions into the system table
(2) add schema change methods internally and tests (possibly expose via JMX for manual testing, but not nodeprobe)
(3) add thrift interface to send schema changes out to other nodes
(4) add gossip of MetadataVersion (a user-provided? automatically generated? identifier string): gossip automatically brings nodes that were down up to date on what happened while they were out. The full schema will not fit in gossip, but a version id will. A node whose internal MV is lower than one it sees in gossip should ask the node w/ the higher version to send it the new version of the schema. (Remember we cannot rely on HH for this, since the FD may not have recognized that the node was down when the update was happening.)
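To make step (4) concrete, here is a rough sketch of the "gossip the version id, pull the schema on mismatch" idea. It is only an illustration: the Node class, gossip_state, send_schema and on_gossip are made-up names, the version is a plain integer so that "lower" is well defined, and the direct schema request is modelled as a local method call rather than a real message.

```python
class Node:
    """Toy node: gossips only its schema version id, never the schema itself."""

    def __init__(self, name, version=0, schema=None):
        self.name = name
        self.version = version            # MetadataVersion (MV); an int here (assumption)
        self.schema = dict(schema or {})  # keyspace name -> list of CF names

    def gossip_state(self):
        # Only the small version id travels in gossip.
        return {"name": self.name, "schema_version": self.version}

    def send_schema(self):
        # Hypothetical direct request/response, outside of gossip.
        return self.version, dict(self.schema)

    def on_gossip(self, state, cluster):
        # Pull the schema only when the peer advertises a higher version.
        if state["schema_version"] > self.version:
            peer = cluster[state["name"]]
            self.version, self.schema = peer.send_schema()
            return True
        return False


# Node b was down while the update to version 2 went out.
a = Node("a", version=2, schema={"Keyspace1": ["Standard1", "Super1"]})
b = Node("b", version=1, schema={"Keyspace1": ["Standard1"]})
cluster = {"a": a, "b": b}

# When b comes back and hears a's gossip, it notices the higher MV and pulls.
b.on_gossip(a.gossip_state(), cluster)
```

Because the pull is driven by the receiving node, nothing depends on the failure detector having noticed that b was down at the time of the update.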
We punt completely on two clients requesting conflicting changes from different coordinator nodes. "Don't do that." (Just as copying out two conflicting config files is Bad.)
One possible layout for the metadata CFs:
migrations: hardcoded key of "migrations": each column w/ name of MetadataVersion contains the op performed
schema: key of MV, supercolumns of KS, columns of serialized CF definitions
so on startup, we read latest MV from migrations row, then the associated schema.
(Looked at this way it seems like we should just have MV be a TimeUUID and not make client deal w/ that.)
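A minimal in-memory sketch of that layout and the startup read, assuming MV is a time-based UUID. Nested dicts stand in for the two system CFs, and new_store, apply_migration and load_latest are hypothetical names, not existing code. Versions are ordered by the UUID's embedded timestamp, since that is the property a TimeUUID gives us.

```python
import uuid

MIGRATIONS_KEY = "migrations"  # the single hardcoded row key


def new_store():
    # "migrations" CF: one row, column name = MV, column value = op performed.
    # "schema" CF: row key = MV, supercolumn = KS, column = serialized CF definition.
    return {"migrations": {MIGRATIONS_KEY: {}}, "schema": {}}


def apply_migration(store, op, schema, version=None):
    # The server generates the version so clients never have to supply one.
    version = version or uuid.uuid1()
    store["migrations"][MIGRATIONS_KEY][version] = op
    store["schema"][version] = schema
    return version


def load_latest(store):
    # Startup: find the newest MV in the migrations row, then read its schema row.
    row = store["migrations"][MIGRATIONS_KEY]
    if not row:
        return None, {}
    latest = max(row, key=lambda v: v.time)  # order by embedded timestamp
    return latest, store["schema"][latest]


store = new_store()
apply_migration(
    store,
    "add CF Keyspace1.Standard1",
    {"Keyspace1": {"Standard1": b"<serialized CF definition>"}},
)
current_version, current_schema = load_latest(store)
```

Storing a full schema snapshot per MV keeps the startup path to two reads, at the cost of rewriting the whole schema on every change.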