Details
- Type: Bug
- Status: Resolved
- Priority: P3
- Resolution: Fixed
- Affects Version/s: 2.30.0
- Fix Version/s: None
Description
When multiple load jobs are needed to write data to a destination table (e.g., when the data is spread over more than 10,000 URIs), WriteToBigQuery in FILE_LOADS mode writes the data into temporary tables and then updates the temporary tables' schemas if schema additions are allowed.
However, the schema update of the temporary tables does not respect the source format specified for the load files (e.g., JSON, AVRO).
UpdateDestinationSchema issues the schema modification job with the default CSV source format, which causes AVRO or JSON loads with nested schemas to fail with the error:
apache_beam.io.gcp.bigquery_file_loads: INFO: Triggering schema modification job beam_bq_job_LOAD_satybald7_SCHEMA_MOD_STEP_994_3869e4dc1dd08c68d20fd047e242161a_7c553f684cce4963a75d669f38a4ec46 on <TableReference datasetId: 'python_write_to_table_1627431111435' projectId: 'DELETED' tableId: 'python_append_schema_update'>
apache_beam.io.gcp.bigquery_tools: INFO: Failed to insert job <JobReference jobId: 'beam_bq_job_LOAD7_SCHEMA_MOD_STEP_994_3869e4dc1dd08c68d20fd047e242161a_7c553f684cce4963a75d669f38a4ec46' projectId: 'DELETED'>: HttpError accessing .... 'content-type': 'application/json; charset=UTF-8', 'content-length': '332', 'date': 'Wed, 28 Jul 2021 00:12:03 GMT', 'server': 'UploadServer', 'status': '400'}>, content <{
  "error": {
    "code": 400,
    "message": "Cannot load CSV data with a nested schema. Field: nested_field",
    "errors": [
      {
        "message": "Cannot load CSV data with a nested schema. Field: nested_field",
        "domain": "global",
        "reason": "invalid"
      }
    ],
    "status": "INVALID_ARGUMENT"
  }
}
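For reference, a minimal sketch (not taken from the report; project, dataset, and table names are hypothetical placeholders) of a pipeline configured the way the description assumes: a nested schema, FILE_LOADS with a non-CSV temp file format, and ALLOW_FIELD_ADDITION, which is the combination under which the schema modification job runs with the default CSV setting and fails:

import apache_beam as beam
from apache_beam.io.gcp.bigquery_tools import FileFormat

# Nested schema: loading this via a CSV-formatted job is what BigQuery
# rejects with "Cannot load CSV data with a nested schema".
schema = {
    'fields': [
        {'name': 'id', 'type': 'INTEGER', 'mode': 'REQUIRED'},
        {'name': 'nested_field', 'type': 'RECORD', 'mode': 'NULLABLE',
         'fields': [{'name': 'value', 'type': 'STRING', 'mode': 'NULLABLE'}]},
    ]
}

with beam.Pipeline() as p:
    (p
     | beam.Create([{'id': 1, 'nested_field': {'value': 'a'}}])
     | beam.io.WriteToBigQuery(
         # Hypothetical destination, for illustration only.
         'my-project:python_write_to_table.python_append_schema_update',
         schema=schema,
         method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
         # Load files are written as newline-delimited JSON, not CSV ...
         temp_file_format=FileFormat.JSON,
         # ... and schema additions are allowed, so UpdateDestinationSchema
         # issues a schema modification load job against the temp tables.
         additional_bq_parameters={
             'schemaUpdateOptions': ['ALLOW_FIELD_ADDITION']},
         write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
         create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))

Note that this tiny input will not by itself take the temporary-table path; per the description above, that path is only used when the load is large enough to require multiple load jobs (e.g., data spread over more than 10,000 URIs).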