Details
-
Bug
-
Status: Triage Needed
-
P2
-
Resolution: Fixed
-
2.16.0
-
None
Description
- FileSystems use static registration using FileSystems#setDefaultPipelineOptions method.
- #setDefaultPipelineOptions is called either when deserializaing SerializablePipelineOptions or during opening of various beam operators.
- FileIO#matchAll is expanded using Reshuffle.viaRandomKey().
- Reshuffle is implemented using .rebalance, that doesn't have a "RichFunction" lifecycle, so we need to find another way to register FileSystems, as the deserialization may happen before other "rich operators" get executed on particular task manager.
This results in random pipeline fails as the task assignment is not deterministic.
We can workaround this, by registering FileSystems during coder deserialization.