  1. Beam
  2. BEAM-8222

Consider making insertId optional in BigQuery.insertAll

Details

    • Type: New Feature
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: io-java-gcp
    • Labels: None

    Description

      The current implementation of StreamingWriteFn (https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StreamingWriteFn.java#L102) sets insertId from the input element, to which a unique ID has already been attached by TagWithUniqueIds (https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TagWithUniqueIds.java#L53). Users report that writes become much faster when insertId is left empty. Can we add a BigQuery option, for example nonInsertId, and emit an empty ID when that option is set?
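      To make the request concrete, here is a minimal sketch of the proposed behavior in plain Java. It deliberately does not use Beam or BigQuery client classes: Row, buildRows, and the option name skipInsertId are hypothetical stand-ins, and the ID scheme (a random prefix plus a sequence number) is only an illustrative assumption. Note that BigQuery uses insertId for best-effort deduplication of streaming inserts, so skipping it presumably trades that deduplication for throughput.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Sketch only: models the proposed option with plain Java types, not Beam or BigQuery classes. */
class OptionalInsertIdSketch {

  /** Hypothetical stand-in for one row of an insertAll request. */
  static final class Row {
    final String insertId; // null means no insertId is sent for this row
    final Map<String, Object> json;

    Row(String insertId, Map<String, Object> json) {
      this.insertId = insertId;
      this.json = json;
    }
  }

  /**
   * Builds request rows. When skipInsertId (a hypothetical option name) is true,
   * rows carry no insertId; otherwise each row gets a unique one.
   */
  static List<Row> buildRows(List<Map<String, Object>> elements, boolean skipInsertId) {
    // Assumed ID scheme for illustration: one random prefix per batch plus a counter.
    String batchPrefix = UUID.randomUUID().toString();
    List<Row> rows = new ArrayList<>();
    long sequence = 0;
    for (Map<String, Object> element : elements) {
      String insertId = skipInsertId ? null : batchPrefix + "-" + sequence++;
      rows.add(new Row(insertId, element));
    }
    return rows;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> elements =
        List.of(Map.<String, Object>of("name", "a"), Map.<String, Object>of("name", "b"));
    // With the option enabled, every row is emitted without an insertId.
    for (Row row : buildRows(elements, true)) {
      System.out.println(row.insertId + " " + row.json);
    }
  }
}
```

      In an actual change, the flag would more likely live on the BigQuery write configuration and be read where StreamingWriteFn assembles its rows; the sketch only shows the branch that decides whether an ID is attached.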

          People

            Assignee: Unassigned
            Reporter: Boyuan Zhang (boyuanz)
            Votes: 0
            Watchers: 6
