Details
-
Bug
-
Status: Resolved
-
P1
-
Resolution: Cannot Reproduce
-
2.2.0
-
None
Description
I am trying to use Method.FILE_LOADS for loading data into BQ in my streaming pipeline using RC3 release of 2.2.0. I am writing to around 500 tables using DynamicDestinations and I am also using withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED). Everything works fine when the first time bigquery load jobs get triggered. But on subsequent triggers pipeline throws a RuntimeException about table not found even though I created the pipeline with CreateDisposition.CREATE_IF_NEEDED. The exact exception is:
java.lang.RuntimeException: Failed to create load job with id prefix 717aed9ed1ef4aa7a616e1132f8b7f6d_a0928cae3d670b32f01ab2d9fe5cc0ee_00001_00001, reached max retries: 3, last failed load job: { "configuration" : { "load" : { "createDisposition" : "CREATE_NEVER", "destinationTable" : { "datasetId" : ..., "projectId" : ..., "tableId" : .... }, "errors" : [ } "message" : "Not found: Table ...., "reason" : "notFound" } ],
My theory is all the subsequent load jobs get trigged using CREATE_NEVER disposition and
this might be due to https://github.com/apache/beam/blob/release-2.2.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L140
When using DynamicDestinations all the destination tables might not be known during the first trigger and hence the pipeline's create disposition should be respected.
Attachments
Issue Links
- links to