Description
Right now a change in the schema for persisted Thrift structs is handled by a rather large "backfill" method that translates from the old schema to the new one. This is a potential source of datastore corruption bugs.
Some thoughts on the overall design of this "backfill" mechanism. The first is that any struct we persist should carry a SCHEMA_VERSION or some such identifier that gets incremented on every schema change (we could also use heuristics to detect the version, but I think a monotonically increasing version number is cleaner). The second is that we should define, essentially, a state machine whose transitions each take a valid V1 struct to a valid V2 struct, a valid V2 struct to a valid V3 struct, and so on (a transition could also provide an inverse operation, but that's not necessary). Part of any schema change will be adding the code for its transition. Loading an arbitrarily old persisted struct then just means applying the transitions in order from its stored version up to the current one, and the code to do so stays relatively clean because each "backfill migration" is a tiny piece of code.
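To make the idea concrete, here is a minimal sketch of such a migration chain in Python. Everything in it is hypothetical and for illustration only: a plain dict with a "schema_version" key stands in for a persisted Thrift struct, the field names (owner, owner_role, tier) and the two migrations are invented, and treating a struct with no version field as V1 is an assumption, not something the current code does.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a persisted Thrift struct.
Struct = Dict[str, object]
Migration = Callable[[Struct], Struct]

# Version the running code expects (hypothetical value).
CURRENT_VERSION = 3


def _v1_to_v2(struct: Struct) -> Struct:
    # Invented example change: V2 renamed "owner" to "owner_role".
    out = dict(struct)
    out["owner_role"] = out.pop("owner")
    return out


def _v2_to_v3(struct: Struct) -> Struct:
    # Invented example change: V3 added "tier" with a default value.
    out = dict(struct)
    out.setdefault("tier", "preferred")
    return out


# One entry per schema change, keyed by the version it upgrades FROM.
MIGRATIONS: Dict[int, Migration] = {
    1: _v1_to_v2,
    2: _v2_to_v3,
}


def backfill(struct: Struct) -> Struct:
    """Return a copy of `struct` upgraded to CURRENT_VERSION."""
    # Assumption: structs persisted before versioning existed count as V1.
    version = struct.get("schema_version", 1)
    if not isinstance(version, int) or not 1 <= version <= CURRENT_VERSION:
        raise ValueError(f"unsupported schema version: {version!r}")

    current = dict(struct)  # never mutate the caller's struct
    while version < CURRENT_VERSION:
        current = MIGRATIONS[version](current)
        version += 1
        current["schema_version"] = version
    return current
```

With this shape, a schema change adds one small function and one entry in MIGRATIONS, and bumps CURRENT_VERSION. Calling backfill on a struct that is already at the current version returns an unchanged copy, and a struct claiming a version newer than the code understands is rejected rather than silently rewritten.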