  1. Beam
  2. BEAM-3067

BigQueryIO.Write fails on empty PCollection with DirectRunner (batch job)

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: io-java-gcp, runner-direct
    • Labels: None
    • Environment: Arch Linux, Java 1.8.0_144

    Description

      I'm using the side output feature to separate malformed events (errors) from a stream of valid events. I then save the valid events to one BigQuery table and the errors to a separate, dedicated table.
      Here is the code for outputting error rows:

      invalidEventRows.apply("WriteErrors", BigQueryIO.writeTableRows()
              .to(errorTableRef)
              .withSchema(ProcessEvents.getErrorSchema())
              .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
              .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
      

      The problem is that when the pipeline runs on DirectRunner in batch mode (reading input from a file) and the invalidEventRows PCollection ends up empty (all events are valid, so there are no errors), I get the following error:

      [ERROR]   "status" : {
      [ERROR]     "errorResult" : {
      [ERROR]       "message" : "No schema specified on job or table.",
      [ERROR]       "reason" : "invalid"
      [ERROR]     },
      [ERROR]     "errors" : [ {
      [ERROR]       "message" : "No schema specified on job or table.",
      [ERROR]       "reason" : "invalid"
      [ERROR]     } ],
      [ERROR]     "state" : "DONE"
      [ERROR]   },
      

      When the same code runs and the invalidEventRows PCollection is not empty, there are no errors: the BigQuery table is created and the data is inserted correctly.
      Everything also seems to work fine in streaming mode (reading from Pub/Sub) on both DirectRunner and DataflowRunner.
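
      To make the triggering condition concrete, here is a minimal Beam-free sketch of the valid/error split. It only models the routing in plain Java and is not the actual pipeline code: the class name EventSplitSketch, the Split holder, and the validity rule in isValid are all hypothetical. With all-valid input the error output is empty, which is the batch-mode case where the "WriteErrors" step fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Beam-free model of routing events to a main output and an error output. */
public class EventSplitSketch {

    /** Holds both outputs, analogous to a main output plus one side output. */
    public static final class Split {
        public final List<String> valid = new ArrayList<>();
        public final List<String> errors = new ArrayList<>();
    }

    /** Hypothetical rule (an assumption): a valid event is a non-blank line containing '='. */
    static boolean isValid(String event) {
        return event != null && !event.trim().isEmpty() && event.contains("=");
    }

    /** Routes each event to exactly one of the two outputs. */
    public static Split split(List<String> events) {
        Split result = new Split();
        for (String event : events) {
            if (isValid(event)) {
                result.valid.add(event);
            } else {
                result.errors.add(event);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // All events are valid, so the error output stays empty.
        Split allValid = split(Arrays.asList("id=1", "id=2"));
        System.out.println("valid=" + allValid.valid.size()
                + " errors=" + allValid.errors.size());
    }
}
```

      In the real pipeline the two lists correspond to the main output and the invalidEventRows side output; the sketch says nothing about why BigQueryIO.Write rejects the empty case.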

      Looks like a bug?
      Or should I open an issue in the GoogleCloudPlatform/DataflowJavaSDK GitHub project?

          People

            Assignee: reuvenlax Reuven Lax
            Reporter: bigunyak Dmitry Bigunyak