  1. Beam
  2. BEAM-7822

TriggerCopyJobs in BQ file loads is not atomic in case of failure

Details

    • Type: Test
    • Status: Resolved
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: 2.13.0
    • Fix Version/s: 2.16.0
    • Component/s: io-py-gcp
    • Labels: None

    Description

      Scenario:
      When temp_tables are used, data is copied from the temp tables to the destination table. If BigQuery fails midway through this copy step, an Exception is raised and the pipeline fails. At that point, some temp_tables have already been copied and others have not. When the pipeline is rerun, the same data is written to new temp_tables, and copy jobs are triggered to copy all of it to the destination table, including the data that the failed run already copied.

      This will result in duplicate data being written to the BigQuery destination table.
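
      The scenario above can be reproduced with a toy model. The following is a minimal sketch that does not call BigQuery or Beam: the destination table is a plain list, and `CopyFailure`, `run_pipeline`, `simulate`, and the `fail_after` parameter are hypothetical names introduced only for illustration. It assumes the copy step appends to the destination, as the description implies.

```python
class CopyFailure(Exception):
    """Stands in for a BigQuery error raised midway through the copy step."""


def run_pipeline(chunks, destination, run_id, fail_after=None):
    """Toy model of one run: load chunks into temp tables, then copy each one."""
    # Each run creates fresh temp tables, so names differ between runs.
    temp_tables = {f"temp_{run_id}_{i}": chunk for i, chunk in enumerate(chunks)}
    for copied, (name, chunk) in enumerate(temp_tables.items()):
        if fail_after is not None and copied == fail_after:
            # Earlier copies are not rolled back: the step is not atomic.
            raise CopyFailure(f"copy of {name} failed")
        destination.extend(chunk)  # append-style copy into the destination


def simulate():
    chunks = [["a", "b"], ["c", "d"], ["e", "f"]]
    destination = []
    try:
        # First run fails after two of the three temp tables were copied.
        run_pipeline(chunks, destination, run_id=1, fail_after=2)
    except CopyFailure:
        pass
    # Rerun writes the same data to new temp tables and copies all of it.
    run_pipeline(chunks, destination, run_id=2)
    return destination


if __name__ == "__main__":
    print(simulate())
```

      In this model, the rows from the two temp tables copied before the failure end up in the destination twice, while the rows from the third appear once.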

          People

            Assignee: ttanay Tanay Tummalapalli
            Reporter: ttanay Tanay Tummalapalli
