Details
-
Improvement
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
None
-
None
Description
There are still a few serializers in Flink that needs to be upgraded to use the new serialization compatibility APIs (TypeSerializerSnapshot and TypeSerializerSchemaCompatibility).
By doing this as soon as possible (ideally all are migrated for the 1.8 release), state serializers would no longer be written via Java serialization into savepoints, and in general allowing serializer upgrades more future-proof.
To split up the efforts, the remaining serializers can be categorized as follows.
They are categorized because some are fairly straightforward, while some are more complex and involves reconfiguration / migration cases.
We should have an independent sub-task JIRA for each category:
- Parameterless serializers / subclasses of TypeSerializerSingleton
- Simple composite serializers that contain nested serializers
- Scala-macro generated serializers
- POJOSerializer
- Kryo-related serializers
- Enum serializers
After upgrading a serializers' snapshot, the following should be achieved to consider the migration completed:
- The serializer should now return a TypeSerializerSnapshot, and not a subclass of the legacy TypeSerializerConfigSnapshot as its snapshot.
- Check whether or not the legacy TypeSerializer#ensureCompatibility method can be removed from the serializer class.
- There is a corresponding test specification for the migrated serializer that uses the TypeSerializerSnapshotMigrationTestBase.
Ideally, to complete this JIRA, we should also have some utilities / meta-tests:
- A test that verifies ALL serializers in the Flink codebase have a corresponding migration test that uses the TypeSerializerSnapshotMigrationTestBase.
- A utility script that generates test data and snapshots of all serializers in the Flink codebase, to be used by the migration tests. This script should be used every time we release a Flink major version.