Beam / BEAM-6055

CoderTypeSerializer#duplicate() should create a deep copy of the coder

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 2.8.0
    • Fix Version/s: Not applicable
    • Component/s: runner-flink
    • Labels:
      None

      Description

      I think that CoderTypeSerializer#duplicate() must make a deep copy of the coder field (see https://github.com/apache/beam/blob/f2f0b02babf745d0d9645e0526637ef967dd2228/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L53). It seems to me that coder objects can be stateful, and the current implementation shares the same coder across multiple serializer instances. Flink can use different serializer instances from different threads, so a shared stateful coder can lead to concurrency problems and data corruption, as in this example: https://issues.apache.org/jira/browse/FLINK-10860
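
      To illustrate the concern, here is a minimal, self-contained sketch using only the JDK. It is not the Beam or Flink code: StatefulCoder, Serializer, shallowDuplicate(), deepDuplicate() and deepCopy() are hypothetical stand-ins, and the serialization round trip is just one possible way to obtain an independent copy. Note that the issue was resolved as "Not A Problem", so this shows the scenario as reported, not a confirmed defect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DuplicateSketch {

    // Hypothetical stand-in for a coder that keeps mutable scratch state.
    static class StatefulCoder implements Serializable {
        private static final long serialVersionUID = 1L;
        // Reused buffer: two threads encoding through the same instance
        // at the same time could overwrite each other's bytes.
        private final byte[] scratch = new byte[8];

        byte[] encode(long value) {
            for (int i = 0; i < 8; i++) {
                scratch[i] = (byte) (value >>> (56 - 8 * i));
            }
            return scratch.clone();
        }
    }

    // Hypothetical stand-in for a serializer that wraps a coder.
    static class Serializer {
        final StatefulCoder coder;

        Serializer(StatefulCoder coder) {
            this.coder = coder;
        }

        // The original and the duplicate share one coder instance.
        Serializer shallowDuplicate() {
            return new Serializer(coder);
        }

        // The duplicate gets its own coder via a serialization round trip.
        Serializer deepDuplicate() {
            return new Serializer(deepCopy(coder));
        }
    }

    // Deep copy by writing the object out and reading it back (assumption:
    // the coder is java.io.Serializable).
    static <T extends Serializable> T deepCopy(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deep copy failed", e);
        }
    }

    public static void main(String[] args) {
        Serializer original = new Serializer(new StatefulCoder());
        // Shallow duplicate: same coder object, so its state is shared.
        System.out.println(original.shallowDuplicate().coder == original.coder); // prints true
        // Deep duplicate: independent coder object.
        System.out.println(original.deepDuplicate().coder == original.coder); // prints false
    }
}
```

      With the shallow variant, every duplicate handed to another thread still writes into the same scratch buffer. With the deep variant, each duplicate owns its buffer, at the cost of one serialization round trip per duplicate() call.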

              People

              • Assignee: Unassigned
              • Reporter: Stefan Richter (srichter)
              • Votes: 0
              • Watchers: 2