  1. Flink
  2. FLINK-26992

Passing PojoSerializer directly between threads may cause ConcurrentModificationException


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 1.16.0
    • None
    • None

    Description

      Extracted from FLINK-26835.

      While investigating that issue, we found that state backends probably also use non-thread-safe serializers from different threads.

      For example, RocksFullSnapshotStrategy#syncPrepareResources passes keySerializer from the task thread to the async thread, which then serializes the serializer itself. RocksIncrementalSnapshotStrategy.RocksDBIncrementalSnapshotOperation#materializeMetaData seems to do the same thing. If PojoSerializer is used as keySerializer, I think this will lead to the same problem as above: the async checkpoint thread iterates over PojoSerializer#subclassSerializerCache while the task thread can still modify the map. It looks like in all of those places the serializer should have been duplicated (#duplicate) before being passed to another thread. Maybe this should happen in RocksDBSnapshotStrategyBase. I don't know about other state backends.
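
      The "duplicate before handing off" idea can be sketched roughly as follows. This is a self-contained illustration only, not Flink code: CachingSerializer, its cache, and DuplicateBeforeHandOff are hypothetical stand-ins for a stateful serializer such as PojoSerializer and for a snapshot strategy, and the way duplicate() copies the cache is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a stateful, non-thread-safe serializer (not a Flink class). */
final class CachingSerializer {
    // Unsynchronized mutable cache, playing the role of a per-subclass serializer cache.
    private final Map<Class<?>, String> subclassCache = new HashMap<>();

    /** Used by the task thread; may add a new entry to the cache. */
    String serialize(Object value) {
        return subclassCache.computeIfAbsent(value.getClass(), Class::getName) + ":" + value;
    }

    /** Iterates over the cache, as writing out the serializer's metadata would. */
    int snapshotSize() {
        int count = 0;
        for (Class<?> ignored : subclassCache.keySet()) {
            count++;
        }
        return count;
    }

    /** Returns an independent instance that shares no mutable state with this one. */
    CachingSerializer duplicate() {
        CachingSerializer copy = new CachingSerializer();
        copy.subclassCache.putAll(subclassCache);
        return copy;
    }
}

/** Hypothetical hand-off: the calling thread duplicates before the async thread sees anything. */
final class DuplicateBeforeHandOff {
    static int snapshotOnAsyncThread(CachingSerializer taskThreadSerializer) {
        // Duplicate on the calling (task) thread, before the instance crosses threads.
        CachingSerializer forAsyncThread = taskThreadSerializer.duplicate();
        ExecutorService asyncThread = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> snapshot = forAsyncThread::snapshotSize;
            Future<Integer> result = asyncThread.submit(snapshot);
            // The task thread keeps mutating its own instance; the duplicate is unaffected.
            taskThreadSerializer.serialize(42L);
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            asyncThread.shutdown();
        }
    }
}
```

      The point of the sketch is only the ordering: the copy is taken on the owning thread, so the async thread never iterates over a map that the task thread can still change.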

       

      =======

      TODO:

      I second that each thread should own the Serializer instance that is passed to it.

      • Figure out whether the current way of passing the Serializer really causes problems in the state backend (i.e. whether concurrent modification is possible).
      • Identify the other places where a Serializer is passed directly between threads.
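
      As a rough illustration of the failure mode the first item asks about, the sketch below shows the fail-fast behaviour of java.util.HashMap when the map is modified during iteration. It is deliberately single-threaded so that it is deterministic; across two threads the same exception is only raised on a best-effort basis. FailFastIterationDemo and the cache contents are hypothetical and are not taken from Flink.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo class (not Flink code). */
final class FailFastIterationDemo {
    /** Returns true if modifying the map while iterating over it was detected. */
    static boolean mutationDuringIterationFails() {
        Map<Class<?>, String> cache = new HashMap<>();
        cache.put(String.class, "string-serializer");
        cache.put(Integer.class, "integer-serializer");
        try {
            for (Class<?> key : cache.keySet()) {
                // Stands in for another thread adding a new entry mid-iteration.
                cache.put(Long.class, "long-serializer");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The iterator noticed the structural modification on its next step.
            return true;
        }
    }
}
```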


            People

              Assignee: Unassigned
              Reporter: Yuan Mei