  1. Beam
  2. BEAM-10793

Incorrect Flink runner documentation

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: runner-flink, sdk-go
    • Labels: None

    Description

      As per the documentation at https://beam.apache.org/documentation/runners/flink/ under "Portable (Java/Python/Go)", a containerized Flink job server needs to be started using

      docker run --net=host apache/beam_flink1.10_job_server:latest
      

      or

      docker run --net=host apache/beam_flink1.10_job_server:latest --flink-master=localhost:8081
      

      If any of the SDKs is run using the DOCKER flag against this containerized job server, it crashes. As explained by danoliveira:

      "This command is building and running it locally on your machine. I'm not 100% sure why running it in a container is causing the error, but my suspicion is that it has to do with writing the manifest/artifact files to disk. One thing the job server does is writing artifacts to disk and then sending the locations to the SDK harness so it can read them. If the job server is in a container, then it's probably writing the files to the container instead of your local machine, so they're inaccessible to the SDK harness."

      In fact, lostluck tracked this to an existing issue, https://issues.apache.org/jira/browse/BEAM-5273, which addresses this exact problem and is yet to be resolved. Following Daniel's advice, the Go SDK (and, I'm certain, the others) can be run in DOCKER mode if the Flink job server is started locally using Gradle as follows:

      ./gradlew :runners:flink:1.10:job-server:runShadow -Djob-host=localhost -Dflink-master=local

      Only if the SDK is run using the LOOPBACK flag does it manage to run on a containerized Flink cluster. Moreover, since the LOOPBACK flag is explicitly meant for local development only, this makes me wonder how folks are deploying their production Beam data pipelines on Flink (especially on managed services like Kubernetes). Overall, the main issue (at least until BEAM-5273 is resolved) is that Beam's documentation fails to mention these caveats explicitly.
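
      The combinations reported above can be summarized as a small dry-run script. This is only a sketch of what this issue observed, not an official Beam tool: the helper names, the pipeline file wordcount.go, the endpoint localhost:8099, and the Go SDK flags --runner, --endpoint and --environment_type are assumptions made for illustration and should be checked against the SDK version in use. The script only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: encodes the combinations reported in this issue.
#   container = job server started with `docker run ... beam_flink1.10_job_server`
#   local     = job server started with `./gradlew ...:job-server:runShadow`

# environment_type_for MODE
# Prints the SDK environment type this issue reports as working for MODE.
environment_type_for() {
  case "$1" in
    container) echo "LOOPBACK" ;;  # DOCKER is reported to crash here (BEAM-5273)
    local)     echo "DOCKER" ;;    # DOCKER is reported to work here
    *)         echo "unknown job server mode: $1" >&2; return 1 ;;
  esac
}

# sdk_command MODE
# Prints (does not execute) a Go SDK invocation for MODE.
# Assumed, not taken from the issue: file name, endpoint, and flag names.
sdk_command() {
  env_type=$(environment_type_for "$1") || return 1
  echo "go run wordcount.go --runner=universal --endpoint=localhost:8099 --environment_type=${env_type}"
}

sdk_command container
sdk_command local
```

      Keeping the mapping in one function makes the caveat explicit: with a containerized job server the only option reported to work is LOOPBACK, which is exactly the gap the documentation does not mention.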

          People

            Assignee: Unassigned
            Reporter: Kevin Sijo Puthusseri (kevinsijo)
            Votes: 0
            Watchers: 3
