  Beam / BEAM-8998

Avoid excessive bundle progress polling in Dataflow Runner

    Details

    • Type: Improvement
    • Status: Triage Needed
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: runner-dataflow
    • Labels:
      None

      Description

      The Dataflow Java runner polls the SDK Harness for bundle progress every 0.1 seconds and uses the result to decide whether data transfer should be throttled. Polling this often can overload the SDK Harness.

      We should find a way to avoid the throttling and to significantly lower the frequency of bundle progress requests.
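
One possible way to lower the request frequency is to let the polling interval grow while a bundle runs, instead of keeping it fixed at 0.1 seconds. The sketch below is only an illustration of that idea: `BackoffPollingSchedule` is a hypothetical helper that does not exist in Beam, the 5 second cap and the growth factor of 2 are assumed values, and it does not address how the throttling decision itself would be replaced.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not part of Beam): decides how long to wait before
 * the next bundle progress request, growing the wait up to a fixed cap.
 */
final class BackoffPollingSchedule {
  private final Duration min;
  private final Duration max;
  private final double factor;
  private Duration current;

  BackoffPollingSchedule(Duration min, Duration max, double factor) {
    if (min.toMillis() < 1 || max.compareTo(min) < 0 || factor < 1.0) {
      throw new IllegalArgumentException("need min >= 1 ms, max >= min, factor >= 1");
    }
    this.min = min;
    this.max = max;
    this.factor = factor;
    this.current = min;
  }

  /** Returns the delay before the next poll, then lengthens the following one. */
  Duration nextDelay() {
    Duration delay = current;
    long grownMillis = (long) Math.min((double) max.toMillis(), current.toMillis() * factor);
    current = Duration.ofMillis(grownMillis);
    return delay;
  }

  /** Returns to the shortest interval, for example when a new bundle starts. */
  void reset() {
    current = min;
  }

  public static void main(String[] args) {
    // Start at the current 0.1 s interval; the 5 s cap is an assumed value.
    BackoffPollingSchedule schedule =
        new BackoffPollingSchedule(Duration.ofMillis(100), Duration.ofSeconds(5), 2.0);
    for (int i = 0; i < 8; i++) {
      System.out.println(schedule.nextDelay().toMillis() + " ms");
    }
  }
}
```

With these assumed settings the delays are 100, 200, 400, 800, 1600, 3200, 5000 and 5000 ms, so a long-running bundle settles at one request every 5 seconds rather than ten per second.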

       

      Code reference:

      frequency setting: https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java#L296

              People

              • Assignee:
                Unassigned
              • Reporter:
                Yichi Zhang (yichi)
              • Votes:
                0
              • Watchers:
                1

                Dates

                • Created:
                  Updated: