
Scala serializers do not have the serialVersionUID specified

    Details

      Description

      Currently, none of the Scala serializers (e.g. OptionSerializer, CaseClassSerializer, TrySerializer) specify a serialVersionUID.

      In 1.3, the implementations of the Scala serializers (and of all serializers in general) had to change because the compatibility methods snapshotConfiguration and ensureCompatibility had to be implemented. As a result, each serializer class got a new generated serialVersionUID.
      This means that when restoring from a pre-1.3 snapshot that contains Scala types as state, the previous serializer stored in the snapshot cannot be deserialized (due to the UID mismatch).

      To fix this, we should specify the serialVersionUIDs of the Scala serializers to be what they originally were pre-1.3. This would then allow users with Scala types as state to restore from older versions.
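
The UID mismatch described above can be sketched with plain Java serialization. This is only an illustration: `PinnedSerializer`, `UnpinnedSerializer` and `SerialUidDemo` are hypothetical names, not Flink classes. When a `Serializable` class declares no `serialVersionUID`, the JVM derives one from the class structure (name, interfaces, fields, methods), so adding methods to the class changes the value; a declared constant stays fixed.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for a serializer with a pinned UID (not Flink code).
class PinnedSerializer implements Serializable {
    // Declared explicitly, so later changes to the class do not alter it.
    private static final long serialVersionUID = 1L;

    int copy(int value) { return value; }
}

// Hypothetical stand-in for a serializer without a declared UID.
class UnpinnedSerializer implements Serializable {
    // No serialVersionUID here: the JVM computes a default from the class
    // structure, so adding a method would produce a different value.
    int copy(int value) { return value; }
}

class SerialUidDemo {
    // Returns the UID that Java serialization would write for this class.
    static long uidOf(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("pinned:   " + uidOf(PinnedSerializer.class));
        System.out.println("unpinned: " + uidOf(UnpinnedSerializer.class));
    }
}
```

Running `uidOf` on an old build of a class is one way to find the value that has to be pinned in a newer build so that previously written objects remain readable.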

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user tzulitai opened a pull request:

          https://github.com/apache/flink/pull/4090

          FLINK-6869 [scala] Specify serialVersionUID for all Scala serializers

          This PR fixes 2 issues:

          1. Configuration snapshots of Scala serializers were not readable:
          Prior to this PR, the configuration snapshot classes of Scala serializers did not have the default empty constructor that is required for deserializing the configuration snapshot.

          Some Scala serializers' config snapshots extend the Java `CompositeTypeSerializerConfigSnapshot`. Their config snapshot classes are now implemented in Java as well, because a Scala subclass can call only a single base class constructor.

          2. Scala serializers did not specify the serialVersionUID:
          Previously, Scala serializers did not specify the `serialVersionUID`, which prevented restoring from snapshots taken with previous Flink versions because the serializers' implementations changed in 1.3.

          The `serialVersionUID`s added in this PR are identical to what they were (as generated by Java) in Flink 1.2, so that we can at least restore state that was written with the Scala serializers as of 1.2.
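
The first of the two fixes above can be illustrated with a small Java sketch. All class names here (`BaseSnapshot`, `OptionLikeSnapshot`, `BrokenSnapshot`, `NullaryConstructorDemo`) are hypothetical and only stand in for the real snapshot classes; the sketch also assumes that the snapshot is instantiated reflectively through its empty constructor when it is read back. It shows that a Java subclass can forward each of its constructors to a different base class constructor, and that a class without an empty constructor cannot be created this way.

```java
import java.lang.reflect.Constructor;

// Hypothetical base class with two constructors: one for reading, one for writing.
abstract class BaseSnapshot {
    protected final String nestedSerializerName;

    protected BaseSnapshot() { this.nestedSerializerName = null; }               // used when reading
    protected BaseSnapshot(String nested) { this.nestedSerializerName = nested; } // used when writing
}

// In Java, each subclass constructor may call a different base constructor.
class OptionLikeSnapshot extends BaseSnapshot {
    public OptionLikeSnapshot() { super(); }
    public OptionLikeSnapshot(String nested) { super(nested); }
}

// Without an empty constructor, reflective instantiation is not possible.
class BrokenSnapshot extends BaseSnapshot {
    public BrokenSnapshot(String nested) { super(nested); }
}

class NullaryConstructorDemo {
    // Tries to create an instance through the empty constructor, as a reader would.
    static boolean canInstantiate(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here when no empty constructor exists.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(OptionLikeSnapshot.class)); // true
        System.out.println(canInstantiate(BrokenSnapshot.class));     // false
    }
}
```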

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/tzulitai/flink FLINK-6869

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/4090.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #4090


          commit 416bd3b122e79bdd8b5876e8d645b346110b67f0
          Author: Tzu-Li (Gordon) Tai <tzulitai@apache.org>
          Date: 2017-06-08T06:52:04Z

          [hotfix] [scala] Fix instantiation of Scala serializers' config snapshot classes

          Prior to this commit, the configuration snapshot classes of Scala
          serializers did not have the proper default empty constructor that is
          used for deserializing the configuration snapshot.

          Since some Scala serializers' config snapshots extend the Java
          CompositeTypeSerializerConfigSnapshot, their config snapshot classes are
          also changed to be implemented in Java since in Scala we can only call a
          single base class constructor from subclasses.

          commit 16574c6623dd64846c888e6a608deb9ae3f081bd
          Author: Tzu-Li (Gordon) Tai <tzulitai@apache.org>
          Date: 2017-06-08T13:29:45Z

          FLINK-6869 [scala] Specify serialVersionUID for all Scala serializers

          Previously, Scala serializers did not specify the serialVersionUID,
          which prevented restoring from snapshots taken with previous Flink
          versions because the serializers' implementations changed.

          The serialVersionUIDs added in this commit are identical to what they
          were (as generated by Java) in Flink 1.2, so that we can at least
          restore state that was written with the Scala serializers as of 1.2.


          githubbot ASF GitHub Bot added a comment -

          Github user tzulitai commented on the issue:

          https://github.com/apache/flink/pull/4090

          R: @StefanRRichter @aljoscha, tagging you because I talked to you about the issue offline. Could you have a quick look?

          githubbot ASF GitHub Bot added a comment -

          Github user tzulitai commented on the issue:

          https://github.com/apache/flink/pull/4090

          One caveat that this PR does not yet fully fix:
          deserializing the anonymous class serializers (`CaseClassSerializer` and `TraversableSerializer`) can still fail even with the `serialVersionUID` specified, because the generated class names of anonymous classes are not guaranteed to be stable (they depend on the order in which the anonymous classes are instantiated, and the naming format also seems to vary across compilers).

          At the moment, I've hit a bit of a wall trying to resolve this. The problem already existed before 1.3: if users simply change the order in which their Scala type serializers are generated (for example, by changing the call order of `createTypeInformation` for their Scala types), the class names change and they can no longer restore state.
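
A rough Java analogue of the naming problem, as a sketch only: `AnonymousNameDemo` is a hypothetical class, and the Scala serializers in question are generated differently. It shows that anonymous classes get compiler-assigned names made of the enclosing class name, a `$`, and a number, so the name that ends up in serialized data is not chosen by the author of the class.

```java
class AnonymousNameDemo {
    // Two anonymous classes declared in the same enclosing class.
    static final Runnable FIRST = new Runnable() {
        @Override public void run() {}
    };

    static final Runnable SECOND = new Runnable() {
        @Override public void run() {}
    };

    // The binary class name is what Java serialization records for an object.
    static String nameOf(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        // Each name ends in "$" followed by a compiler-assigned number.
        System.out.println(nameOf(FIRST));
        System.out.println(nameOf(SECOND));
    }
}
```

Because the number is assigned by the compiler, reordering or adding anonymous classes can shift which class gets which name, and data written under the old name then resolves to a different class or to none at all.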

          githubbot ASF GitHub Bot added a comment -

          Github user StefanRRichter commented on the issue:

          https://github.com/apache/flink/pull/4090

          Overall, I think this is ok as a best effort until we have some eager registration that helps with the remaining problems in the heap backend.

          githubbot ASF GitHub Bot added a comment -

          Github user aljoscha commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4090#discussion_r121384869

          --- Diff: flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializerSerializationUtil.java ---
          @@ -362,8 +407,14 @@ public void read(DataInputView in) throws IOException {
           int serializerBytes = in.readInt();
           byte[] buffer = new byte[serializerBytes];
           in.readFully(buffer);

          - try {
          - typeSerializer = InstantiationUtil.deserializeObject(buffer, userClassLoader);
          +
          + ClassLoader old = Thread.currentThread().getContextClassLoader();
          --- End diff --

          Maybe rename to `previousClassLoader`
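
The diff excerpt only shows the current context class loader being saved. A common idiom around such a line, sketched here as an assumption rather than as the PR's actual code, is to install another class loader for the duration of a call and restore the saved one in a `finally` block. `ContextClassLoaderDemo` and `withContextClassLoader` are hypothetical names.

```java
import java.util.function.Supplier;

class ContextClassLoaderDemo {
    // Runs the action with the given context class loader installed on the
    // current thread, then restores the previous one even if the action throws.
    static <T> T withContextClassLoader(ClassLoader loader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previousClassLoader = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            current.setContextClassLoader(previousClassLoader);
        }
    }

    public static void main(String[] args) {
        ClassLoader custom = new java.net.URLClassLoader(new java.net.URL[0]);
        ClassLoader seen = withContextClassLoader(
                custom, () -> Thread.currentThread().getContextClassLoader());
        System.out.println("custom loader visible inside call: " + (seen == custom));
    }
}
```

Naming the saved variable `previousClassLoader`, as suggested in the review comment, makes the restore step in the `finally` block easier to read.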

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/4090

          tzulitai Tzu-Li (Gordon) Tai added a comment -

          Fixed for 1.3.1 via 7aafaf6aa549c30257cf6176fb220ce5150ee9fe.
          Fixed for master via b216a4a0acf4e4d0463c3ed961d6a0258223491a.


            People

            • Assignee:
              tzulitai Tzu-Li (Gordon) Tai
              Reporter:
              tzulitai Tzu-Li (Gordon) Tai