Beam / BEAM-12449

OutOfMemoryError in Google Dataflow pipeline in 2.29.0

Details

    • Type: Bug
    • Status: Triage Needed
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.29.0
    • Fix Version/s: None
    • Component/s: io-java-gcp
    • Labels: None

    Description

      Our pipeline reads data from PubSub and writes it to BigQuery.

      When we updated our Beam version to 2.29.0, we started seeing many errors like

      "~ Channel ManagedChannelImpl{logId=59, target=bigquerystorage.googleapis.com:443} was not shutdown properly!!! ~"

      an error already reported in:

      https://issues.apache.org/jira/browse/BEAM-12365
      https://issues.apache.org/jira/browse/BEAM-12356
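
      That warning means a gRPC client created a ManagedChannel that was never shut down. The same failure mode can be shown with plain JDK thread pools: a pool that is never shut down pins its worker threads (and their native stacks) until the JVM exits, which is how leaked channels eventually exhaust the worker. A stdlib-only sketch of the pattern (class and method names here are illustrative, not Beam or gRPC internals):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative only: mirrors the leak pattern, not Beam/gRPC code. */
public class ChannelLeakDemo {
    /** Leaky if the caller forgets cleanup: each call creates a pool. */
    static ExecutorService openClientPool() {
        return Executors.newFixedThreadPool(2);
    }

    /** Correct cleanup, analogous to ManagedChannel.shutdown(). */
    static void closeClientPool(ExecutorService pool) {
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = openClientPool();
        pool.submit(() -> System.out.println("request sent"));
        // Without this call, the pool's 2 worker threads leak per client.
        closeClientPool(pool);
        System.out.println("terminated=" + pool.isTerminated());
    }
}
```

      If a new client (and thus a new pool/channel) is created per bundle rather than per worker, the leaked threads accumulate until thread creation fails, matching the stack trace below.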

      But the worst part for us was that we also started getting many OutOfMemoryError failures:

      {
        "insertId": "3218987470810364545:19344:0:33525266",
        "jsonPayload": {
          "thread": "36781",
          "job": "2021-06-02_05_56_45-14040068175423437671",
          "stage": "P2",
          "exception": "java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached\n\tat java.base/java.lang.Thread.start0(Native Method)\n\tat java.base/java.lang.Thread.start(Thread.java:803)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1005)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:834)\n",
          "work": "ac872063622e9bea-1fc5c3cfe1c5e604",
          "logger": "org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker",
          "worker": "text-events-processor-06020556-7d78-harness-b3z1",
          "message": "Uncaught exception in main thread. Exiting with status code 1."
        },
        "resource": {
          "type": "dataflow_step",
          "labels": {
            "project_id": "bolcom-stg-trex-c7d",
            "job_name": "text-events-processor",
            "region": "europe-west1",
            "job_id": "2021-06-02_05_56_45-14040068175423437671",
            "step_id": ""
          }
        },
        "timestamp": "2021-06-02T22:28:07.363Z",
        "severity": "ERROR",
        "labels": {
          "dataflow.googleapis.com/job_id": "2021-06-02_05_56_45-14040068175423437671",
          "compute.googleapis.com/resource_id": "3218987470810364545",
          "compute.googleapis.com/resource_name": "text-events-processor-06020556-7d78-harness-b3z1",
          "dataflow.googleapis.com/region": "europe-west1",
          "dataflow.googleapis.com/log_type": "supportability",
          "compute.googleapis.com/resource_type": "instance",
          "dataflow.googleapis.com/job_name": "text-events-processor"
        },
        "logName": "projects/bolcom-stg-trex-c7d/logs/dataflow.googleapis.com%2Fworker",
        "receiveTimestamp": "2021-06-02T22:28:09.656481597Z"
      }
      

      I suspect these two errors are related, but in any case an OutOfMemoryError should not be ignored.
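
      One way to check whether the channel leak and the OOM are related is to watch the worker's live thread count climb over time: "unable to create native thread" means the process ran out of native thread resources, not heap. A minimal JDK-only probe (the class name is illustrative; in a real pipeline you would log this periodically from a DoFn or sidecar):

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;

/** Illustrative probe: sample live JVM thread count to spot a leak. */
public class ThreadCountProbe {
    public static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        int before = liveThreads();
        // Simulate leaked worker threads: started, then parked forever.
        CountDownLatch parked = new CountDownLatch(1);
        for (int i = 0; i < 3; i++) {
            Thread t = new Thread(() -> {
                try {
                    parked.await();
                } catch (InterruptedException ignored) {
                }
            });
            t.setDaemon(true); // daemon so this demo JVM can still exit
            t.start();
        }
        int after = liveThreads();
        System.out.println("before=" + before + " after=" + after);
        parked.countDown(); // release the simulated leaked threads
    }
}
```

      A steadily growing count across bundles (rather than a plateau) points at per-bundle client creation without shutdown as the cause of the thread exhaustion.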

      We had to roll back to 2.28.0; the pipeline now runs fine, without any errors.

          People

            Assignee: Unassigned
            Reporter: Nikolai Romanov (Romster)
