Beam / BEAM-384

Streaming BigQueryIO should support user-provided IDs

Details

    • Type: Improvement
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 0.1.0-incubating, 0.2.0-incubating
    • Fix Version/s: None
    • Component/s: io-java-gcp
    • Labels: None

    Description

      Currently, streaming BigQueryIO always assigns its own IDs and performs a shuffle to ensure that they are atomic. The shuffle incurs a noticeable cost and is unnecessary when the user already has deterministic IDs to use. The sink should be able to accept these user-provided IDs and skip the shuffle.
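
      To illustrate what "deterministic IDs" could mean here, the following is a minimal sketch and not BigQueryIO's API. It assumes each record carries a stable, user-chosen key (the `recordKey` string and the `insertIdFor` helper are hypothetical names) and hashes that key into an ID. Because the ID depends only on the record, a retried write would produce the same ID without any shuffle, whereas a randomly generated ID would differ on each attempt. BigQuery's streaming insert API accepts a per-row `insertId` for best-effort deduplication, which is presumably where such an ID would end up.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class DeterministicInsertId {

  // Hypothetical helper: derives an ID from a stable, user-chosen record key.
  // The same key always yields the same ID, so retries do not change it.
  public static String insertIdFor(String recordKey) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(recordKey.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  // For contrast: a random ID is different on every call, so it would have to
  // be fixed in place (for example by a shuffle) before a retryable write.
  public static String randomInsertId() {
    return UUID.randomUUID().toString();
  }

  public static void main(String[] args) {
    // Assumed example key: a business identifier plus an event timestamp.
    String key = "order-1234|2016-06-01T12:00:00Z";

    String firstAttempt = insertIdFor(key);
    String retry = insertIdFor(key);
    System.out.println("deterministic IDs match on retry: " + firstAttempt.equals(retry));

    String randomFirst = randomInsertId();
    String randomRetry = randomInsertId();
    System.out.println("random IDs match on retry: " + randomFirst.equals(randomRetry));
  }
}
```

      The hash is only one way to obtain such an ID; a user who already has a unique key per row could pass it through unchanged.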

          People

            Assignee: Unassigned
            Reporter: Daniel Mills (millsd@google.com)
            Votes: 2
            Watchers: 4

            Dates

              Created:
              Updated: