Beam / BEAM-1438

The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

    Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.5.0
    • Component/s: runner-dataflow
    • Labels: None

      Description

      If a Write specifies 0 output shards, the runner is expected to pick an appropriate sharding. The default behavior is to write one shard per input bundle. This works well with the Dataflow batch runner, but not with the streaming runner, which produces large numbers of small bundles and therefore large numbers of small output shards.
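
To make the effect concrete, here is a minimal sketch in plain Java (no Beam dependency) that counts output shards under the one-shard-per-bundle default and under a fixed shard count. The class name, method names, and bundle sizes are hypothetical and chosen only for illustration; they are not taken from the Beam code base or from measurements of the Dataflow runners.

```java
import java.util.Collections;
import java.util.List;

public class ShardingSketch {

    // Default policy described in the issue: every input bundle becomes one output shard.
    static int shardsWithPerBundlePolicy(List<Integer> bundleSizes) {
        return bundleSizes.size();
    }

    // Alternative policy (assumption for comparison): a fixed shard count,
    // capped by the number of elements so no shard is empty.
    static int shardsWithFixedCount(List<Integer> bundleSizes, int numShards) {
        long totalElements = 0;
        for (int size : bundleSizes) {
            totalElements += size;
        }
        return (int) Math.min((long) numShards, totalElements);
    }

    public static void main(String[] args) {
        // Hypothetical workloads with the same total of 1,000,000 elements.
        List<Integer> batchLike = Collections.nCopies(4, 250_000);      // few large bundles
        List<Integer> streamingLike = Collections.nCopies(10_000, 100); // many small bundles

        System.out.println("batch-like, per-bundle shards: "
                + shardsWithPerBundlePolicy(batchLike));
        System.out.println("streaming-like, per-bundle shards: "
                + shardsWithPerBundlePolicy(streamingLike));
        System.out.println("streaming-like, fixed shards: "
                + shardsWithFixedCount(streamingLike, 10));
    }
}
```

With the same amount of data, the per-bundle policy yields as many shards as there are bundles, so a workload split into many small bundles produces many small shards. In a real pipeline, one way to avoid the default is to request an explicit shard count on the sink (for example `withNumShards` on Beam's file-based writes).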

              People

              • Assignee: Unassigned
              • Reporter: Reuven Lax (reuvenlax)
              • Votes: 0
              • Watchers: 9

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 4h 20m