Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-2761

Write to empty BigQuery partition fails with "No schema specified on job or table." despite having provided schema

Details

    • Bug
    • Status: Resolved
    • P3
    • Resolution: Cannot Reproduce
    • 2.1.0, 2.2.0
    • 2.2.0
    • runner-dataflow
    • None

    Description

      In 2.1.0-SNAPSHOT and 2.2.0-SNAPSHOT, jobs writing an empty PCollection to a BigQuery partition fail with "java.lang.RuntimeException: Failed to create load job with id prefix". This is associated with a message "No schema specified on job or table" even though a schema is provided. See attached stack trace for the more detail on the error.

      Command to run job:

      mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.EmptyPCollection \
           -Dexec.args="--runner=DataflowRunner --project=<GCP project> \
                        --gcpTempLocation=<tmp location>" \
           -Pdataflow-runner
      

      Code to reproduce the problem:

      EmptyPCollection.java
      public class EmptyPCollection {
      
        public static void main(String[] args) {
      
          PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
          options.setTempLocation("<your tmp location>");
          Pipeline pipeline = Pipeline.create(options);
      
          String schema = "{\"fields\": [{\"name\": \"pet\", \"type\": \"string\", \"mode\": \"required\"}]}";
          String table = "mydataset.pets";
          List<String> pets = Arrays.asList("Dog", "Cat", "Goldfish");
          PCollection<String> inputText = pipeline.apply(Create.of(pets)).setCoder(StringUtf8Coder.of());
          PCollection<TableRow> rows = inputText.apply(ParDo.of(new DoFn<String, TableRow>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
              String text = c.element();
              if (text.startsWith("X")) {  // change to (D)og and works fine
                TableRow row = new TableRow();
                row.set("pet", text);
                c.output(row);
              }
            }
          }));
      
          rows.apply(BigQueryIO.writeTableRows().to(table).withJsonSchema(schema)
              .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
              .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));
      
          pipeline.run().waitUntilFinish();
      
        }
      }
      

      Attachments

        Activity

          People

            reuvenlax Reuven Lax
            fallon Fallon
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: