Beam / BEAM-8089

Error while using customGcsTempLocation() with Dataflow

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.13.0
    • Fix Version/s: None
    • Component/s: io-java-gcp
    • Labels: None

    Description

      I have the following code snippet which writes content to BigQuery via File Loads.

      Currently the files are staged in a GCS bucket, but I want to write them to the Dataflow workers' local file storage instead and have BigQuery load the data from there.


      BigQueryIO
          .writeTableRows()
          .withNumFileShards(100)
          .withTriggeringFrequency(Duration.standardSeconds(90))
          .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
          .withSchema(getSchema())
          .withoutValidation()
          .withCustomGcsTempLocation(new ValueProvider<String>() {
              @Override
              public String get() {
                  return "/home/harshit/testFiles";
              }

              @Override
              public boolean isAccessible() {
                  return true;
              }
          })
          .withTimePartitioning(new TimePartitioning().setType("DAY"))
          .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
          .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
          .to(tableName);

      On running this, I don't see any files being written to the provided path, and the BigQuery load jobs fail with an IOException.

       

      I looked at the docs, but I was unable to find any working example for this.
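      A likely cause: BigQuery batch load jobs read their staging files from Cloud Storage, so the temp location passed to withCustomGcsTempLocation must be a gs:// URI; a local worker path such as /home/harshit/testFiles is not visible to the BigQuery service, which would explain the IOException. The following plain-Java sketch (no Beam dependency; the helper name and regex are illustrative, not part of the Beam API) shows the expected form of the path:

```java
// Illustrative sketch only: BigQuery load jobs stage and read files from
// Cloud Storage, so the custom temp location must be a gs://bucket/... URI.
// This helper is hypothetical and not part of Beam or the GCP SDK.
public class GcsTempLocationCheck {

    /** Returns true only for well-formed gs://bucket/... URIs. */
    static boolean isGcsPath(String path) {
        return path != null && path.matches("gs://[a-z0-9][a-z0-9._-]*(/.*)?");
    }

    public static void main(String[] args) {
        System.out.println(isGcsPath("gs://my-bucket/bq-temp"));   // prints true
        System.out.println(isGcsPath("/home/harshit/testFiles"));  // prints false
    }
}
```

      Under this assumption, pointing withCustomGcsTempLocation at a gs:// bucket path (rather than a local directory) should let the load jobs succeed.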

      Attachments

      Activity

      People

          Assignee: Unassigned
          Reporter: the-dagger (Harshit Dwivedi)
          Votes: 0
          Watchers: 5

      Dates

          Created:
          Updated: