Details
-
New Feature
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
None
Description
Currently, users have access to upgrade state serializers on the restore run of a stateful job, as long as the upgraded new serializer remains backwards compatible with all previous written data in the savepoint (i.e. it can read all previous and current schema of serialized state objects).
What is still lacking is the ability to upgrade to incompatible serializers. Upon being registered an incompatible serializer for existing restored state, that state needs to go through the process of -
1. read serialized state with the previous serializer
2. passing each deserialized state object through a “migration map function”, and
3. writing back the state with the new serializer
The availability of this process should be strictly limited to state registrations that occur before the actual processing begins (e.g. in the open or initializeState methods), so that we avoid performing these operations during processing.
How this procedure actually occurs, differs across different types of state backends.
For example, for state backends that eagerly deserialize / lazily serialize state (e.g. HeapStateBackend), the job execution itself can be seen as a "migration"; everything is deserialized to state objects on restore, and is only serialized again, with the new serializer, on checkpoints.
Therefore, for these state backends, the above process is irrelevant.
On the other hand, for state backends that lazily deserialize / eagerly serialize state (e.g. RocksDBStateBackend), the state evolution process needs to happen for every state with a newly registered incompatible serializer.
Procedure 2. will allow even state type migrations, but that is out-of-scope of this JIRA.
This ticket focuses only on procedures 1. and 3., where we try to enable schema evolution without state type changes.
This is an umbrella JIRA ticket that overlooks this feature, including a few preliminary tasks that work towards enabling it.
Attachments
Issue Links
- relates to
-
FLINK-29807 Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0
- Closed
- links to