Beam / BEAM-8577

FileSystems may not have been initialized during ResourceId deserialization

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.16.0
    • Fix Version/s: 2.19.0
    • Component/s: runner-flink
    • Labels: None

      Description

      • FileSystems relies on static registration through the FileSystems#setDefaultPipelineOptions method.
      • #setDefaultPipelineOptions is called either when deserializing SerializablePipelineOptions or when various Beam operators are opened.
      • FileIO#matchAll is expanded using Reshuffle.viaRandomKey().
      • Reshuffle is implemented using .rebalance, which doesn't have a "RichFunction" lifecycle, so we need to find another way to register FileSystems, as the deserialization may happen before other "rich operators" get executed on a particular task manager.

      This results in random pipeline failures, as the task assignment is not deterministic.

      We can work around this by registering FileSystems during coder deserialization.
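
A minimal sketch of the idea of registering during deserialization, using only the JDK. It is not the actual Beam change. StaticRegistry is a hypothetical stand-in for the statically configured FileSystems, RegisteringCoder is a hypothetical coder-like object, and a plain String stands in for the pipeline options. The point it illustrates is that a readObject hook runs on whichever worker deserializes the object, so registration no longer depends on a "rich" operator having been opened first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a statically configured registry such as FileSystems. */
final class StaticRegistry {
  private static final AtomicReference<String> OPTIONS = new AtomicReference<>();

  private StaticRegistry() {}

  static void setDefaultOptions(String options) {
    OPTIONS.set(options);
  }

  static boolean isInitialized() {
    return OPTIONS.get() != null;
  }

  /** Simulates a fresh worker JVM in which nothing has been registered yet. */
  static void reset() {
    OPTIONS.set(null);
  }

  /** Fails when called before registration, mirroring the failure described above. */
  static String resolve(String path) {
    String options = OPTIONS.get();
    if (options == null) {
      throw new IllegalStateException("registry not initialized");
    }
    return options + ":" + path;
  }
}

/** Hypothetical coder-like object that carries the options it needs. */
final class RegisteringCoder implements Serializable {
  private static final long serialVersionUID = 1L;
  private final String options; // assumption: a String stands in for pipeline options

  RegisteringCoder(String options) {
    this.options = options;
  }

  /** Java serialization hook: register as soon as this object is deserialized. */
  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    StaticRegistry.setDefaultOptions(options);
  }

  String decode(String encodedPath) {
    return StaticRegistry.resolve(encodedPath);
  }
}

final class CoderDeserializationSketch {
  static byte[] serialize(Serializable value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  static Object deserialize(byte[] bytes) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] bytes = serialize(new RegisteringCoder("worker-options"));
    StaticRegistry.reset(); // pretend the bytes arrive on a fresh worker
    RegisteringCoder coder = (RegisteringCoder) deserialize(bytes);
    System.out.println(coder.decode("/tmp/input.txt"));
  }
}
```

In this sketch the coder must carry its own copy of the options, which is the trade-off of the approach: registration becomes independent of operator lifecycle at the cost of serializing the options alongside the coder.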

              People

              • Assignee: dmvk David Morávek
              • Reporter: dmvk David Morávek
              • Votes: 1
              • Watchers: 3

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 2.5h