Details
Type: Improvement
Status: Triage Needed
Priority: P2
Resolution: Fixed
Description
The Java SDK supports several methods for writing data into BigQuery, while the Python SDK supports the following:
- Streaming inserts for streaming pipelines, as seen in bigquery.py and BigQueryWriteFn
- File loads for batch pipelines, as implemented in PR 7655
Quick and dirty early design doc: https://s.apache.org/beam-bqfl-py-streaming
The Java SDK also supports file loads for streaming pipelines; see the BatchLoads application.
File loads are much cheaper than streaming inserts, although records take longer to show up in the table.
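For illustration only, here is a rough sketch of the general idea behind file loads in a streaming pipeline: buffer incoming rows and hand them off as one batch once a row count or a time limit is reached. This is not the Beam implementation or the design in the doc above. `BatchingLoader`, `load_fn`, `max_rows`, `max_seconds` and `demo` are hypothetical names, and `load_fn` merely stands in for "write the rows to a file and start a BigQuery load job".

```python
import time
from typing import Callable, Dict, List

Row = Dict[str, object]


class BatchingLoader:
    """Hypothetical helper (not part of the Beam SDK).

    Buffers streamed rows and passes them to `load_fn` in batches, flushing
    when either the row-count limit or the time limit is reached.
    """

    def __init__(
        self,
        load_fn: Callable[[List[Row]], None],
        max_rows: int = 500,
        max_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_fn = load_fn
        self._max_rows = max_rows
        self._max_seconds = max_seconds
        self._clock = clock
        self._buffer: List[Row] = []
        self._last_flush = clock()

    def add(self, row: Row) -> None:
        """Buffer one row; flush if the batch is full or old enough."""
        self._buffer.append(row)
        too_many = len(self._buffer) >= self._max_rows
        too_old = self._clock() - self._last_flush >= self._max_seconds
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered rows to `load_fn` (if any) and reset the timer."""
        if self._buffer:
            self._load_fn(list(self._buffer))
            self._buffer.clear()
        self._last_flush = self._clock()


def demo() -> List[int]:
    """Stream 7 rows with a 3-row limit and return the size of each batch."""
    batches: List[List[Row]] = []
    loader = BatchingLoader(batches.append, max_rows=3, max_seconds=3600.0)
    for i in range(7):
        loader.add({"id": i})
    loader.flush()  # emit the final partial batch
    return [len(b) for b in batches]
```

Raising `max_rows` or `max_seconds` means fewer, larger loads at the price of rows taking longer to become visible, which is the cost versus latency trade-off described above.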