Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-13787

Job in Flink + K8S Cluster exits without executing the pipeline

Details

    • Bug
    • Status: Resolved
    • P3
    • Resolution: Invalid
    • 2.34.0
    • 2.34.0
    • sdk-py-harness
    • None
    • Python version: 3.7.9
      Flink version: 1.13.5
      Beam version: 2.34.0
      Beam job server: apache/beam_flink1.13_job_server
      Kubernetes (GCP/GKE) version: 1.20.11-gke.1300

    Description

       

      I am running a beam pipeline on Flink deployed on Standalone K8S with Session Mode.

      Beam pipeline options are

      options = [
      "--runner=PortableRunner",
      "--job_endpoint=beam-job-server:8099",
      "--artifact_endpoint=beam-job-server:8098",
      "--environment_type=EXTERNAL",
      "--environment_config=localhost:50000"
      ]

      Worker pools are deployed as containers in taskmanager pod. A separate job server runs as k8S deployment with image apache/beam_flink1.13_job_server

       

      The worker pool uses a custom image built on top of apache/beam_python3.7_sdk:2.34.0 and contains user code.

       

      The client is able to successfully submit the job and it gets dispatched to worker-pool container. But the python subprocess appears to fatally crash without executing the pipeline. The pipeline status on the client side shows as DONE

      2022-02-01 00:52:11,910 None wait_until_finish_read (ProcessId : 1) INFO portable_runner.py:(576) Job state changed to DONE

       

       

      Log from worker pool container 

       
      beam-worker-pool
      2022/02/01 00:52:05 Initializing python harness: /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:44717 --artifact_endpoint=localhost:42449 --provision_endpoint=localhost:35353 --control_endpoint=localhost:34919

       
      beam-worker-pool
      2022/02/01 00:52:05 Installing setup packages ...

       
      beam-worker-pool
      2022/02/01 00:52:05 Executing: python -m apache_beam.runners.worker.sdk_worker_main

       
      beam-worker-pool
      2022/02/01 00:52:08 Python exited: <nil>

       

       

      There is no other log statement. The last statement comes from fatal statement in boot code line 209.

      https://github.com/apache/beam/blob/v2.34.0/sdks/python/container/boot.go

       

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            abhinav-sarkari Abhinav Sarkari
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: