Beam / BEAM-1438

The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.5.0
    • Component/s: runner-dataflow
    • Labels: None

    Description

      If a Write specifies 0 output shards, the runner is expected to pick an appropriate sharding. The default behavior is to write one shard per input bundle. This works well with the Dataflow batch runner, but not with the streaming runner, which produces large numbers of small bundles and therefore large numbers of small output shards.
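
      The effect described above can be illustrated with a minimal sketch. It does not use the Beam API: it is a plain-Python simulation, and the function names (shards_per_bundle, fixed_shards), the record count, and the bundle sizes are all hypothetical values chosen for illustration. It compares the default of one shard per input bundle with an explicitly fixed shard count.

```python
from typing import List


def shards_per_bundle(bundles: List[List[str]]) -> List[List[str]]:
    """Default behavior from the description: one output shard per input bundle."""
    return [list(bundle) for bundle in bundles]


def fixed_shards(bundles: List[List[str]], num_shards: int) -> List[List[str]]:
    """Hypothetical alternative: spread all records round-robin over a fixed shard count."""
    shards: List[List[str]] = [[] for _ in range(num_shards)]
    index = 0
    for bundle in bundles:
        for record in bundle:
            shards[index % num_shards].append(record)
            index += 1
    return shards


# 1000 records; the bundle sizes below are assumptions, not measured Dataflow values.
records = [f"r{i}" for i in range(1000)]
batch_bundles = [records[i:i + 250] for i in range(0, 1000, 250)]   # few large bundles
streaming_bundles = [records[i:i + 2] for i in range(0, 1000, 2)]   # many small bundles

batch_default = shards_per_bundle(batch_bundles)
streaming_default = shards_per_bundle(streaming_bundles)
streaming_fixed = fixed_shards(streaming_bundles, 4)

print(len(batch_default))      # 4 shards of 250 records each
print(len(streaming_default))  # 500 shards of 2 records each
print(len(streaming_fixed))    # 4 shards of 250 records each
```

      With the same input, the per-bundle default yields a shard count that tracks the number of bundles, which is why it degrades when bundles are small and numerous.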

            People

              Assignee: Unassigned
              Reporter: Reuven Lax (reuvenlax)
              Votes: 0
              Watchers: 9

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 20m