Details
- Type: Bug
- Status: Resolved
- Priority: P2
- Resolution: Duplicate
- Affects Version: 2.29.0
- Fix Version: None
Description
While running a job on Dataflow that writes to BigQuery using the `FILE_LOADS` write method, I noticed the following error in the `MultiPartitionsWriteTables` step:
```json
{
  "errorResult": {
    "message": "Schema update options should only be specified with WRITE_APPEND disposition, or with WRITE_TRUNCATE disposition on a table partition.",
    "reason": "invalid"
  },
  "errors": [
    {
      "message": "Schema update options should only be specified with WRITE_APPEND disposition, or with WRITE_TRUNCATE disposition on a table partition.",
      "reason": "invalid"
    }
  ],
  "state": "DONE"
}
```
Here's the write configuration that I'm using:
```java
BigQueryIO
    .write()
    .to(...)
    .withSchema(...)
    .withFormatFunction(...)
    .withCreateDisposition(CREATE_IF_NEEDED)
    .withWriteDisposition(WRITE_APPEND)
    .withSchemaUpdateOptions(Collections.singleton(SchemaUpdateOption.ALLOW_FIELD_ADDITION))
    .withTimePartitioning(new TimePartitioning().setType("DAY").setRequirePartitionFilter(false).setField("ts"))
    .withMethod(Method.FILE_LOADS)
    .withTriggeringFrequency(Minutes.minutes(5).toStandardDuration)
    .withAutoSharding()
    .optimizedWrites()
```
I believe this is because the schema update options are passed through to the `WriteTables` constructor for the temp tables here. It might be safe to pass `null` there instead, since the schema update options shouldn't be needed when the temp tables are always created from scratch, but I'm not sure whether that would have other consequences.

This prevents any of the load jobs from completing, so none of the data ever makes it to the BigQuery table.
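To illustrate the proposed fix, here is a minimal, hypothetical sketch (not actual Beam code; the class and method names are invented for illustration): the load jobs that populate the freshly created temp tables would receive an empty set of schema update options, while only the load into the final destination table keeps the user's configured options.

```java
import java.util.Collections;
import java.util.Set;

public class SchemaUpdateOptionsFix {

    // Hypothetical helper: schema update options such as ALLOW_FIELD_ADDITION
    // only make sense for the final destination table. Temp tables are always
    // created from scratch, so passing the options there triggers the
    // "invalid" error shown above.
    static Set<String> optionsForLoad(boolean isTempTableLoad, Set<String> configured) {
        return isTempTableLoad ? Collections.emptySet() : configured;
    }

    public static void main(String[] args) {
        Set<String> configured = Collections.singleton("ALLOW_FIELD_ADDITION");
        System.out.println("temp table load options:  " + optionsForLoad(true, configured));
        System.out.println("destination load options: " + optionsForLoad(false, configured));
    }
}
```

The same effect could be had by passing `null` (or an empty set) at the `WriteTables` call site for the temp-table branch only, leaving the destination-table path untouched.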
Issue Links
- Blocked
  - BEAM-12482 BigQueryIO failed to load data to temp table when withSchemaUpdateOptions is set. (Open)
- is duplicated by
  - BEAM-12482 BigQueryIO failed to load data to temp table when withSchemaUpdateOptions is set. (Open)