Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-14206

Direct Runner Does Not Work As Well As Google Dataflow

Details

    • Bug
    • Status: Open
    • P2
    • Resolution: Unresolved
    • 2.37.0
    • None
    • io-java-gcp

    Description

      As a Staff Software Engineer at Forgerock developing Beam applications,
      I want Direct Runner to work the same as Google Dataflow with Flex Templates,
      So that Beam applications can be developed and tested locally with a good developer experience.

      This is a high-level report that may need to be broken down into more specific reports.

      var stream = pipeline // Extract - Transform - Load, Lather Rinse Repeat
          //.apply("Extract from Source",   PubsubIO.readStrings().fromSubscription(inputSubscription).withTimestampAttribute("date"))
          .apply("Extract from Source",   PubsubIO.readStrings().fromSubscription(inputSubscription))
          .apply("Transform to TableRow", ParDo.of(new TransformToTableRow()))
          .apply("Define Window",         Window.<TableRow>into(FixedWindows.of(Duration.standardMinutes(1))).withAllowedLateness(Duration.standardMinutes(10)).accumulatingFiredPanes())
      
          .apply("Transform to JSON",     MapElements.into(TypeDescriptors.strings()).via(GenericJson::toString));
      
      stream.apply("Load to PubSub",      PubsubIO.writeStrings().to(sinkTopic));
      
      stream.apply("Load to GCS",         TextIO.write().to(sinkUri).withWindowedWrites().withNumShards(1));
      

      This code works as expected running under Google Dataflow as a Flex Template.

      When running locally under Direct Runner there are two problems

      1. .withTimestampAttribute("date") fails to find the "date" attribute
        • Exception in thread "main" java.lang.IllegalArgumentException: PubSub message is missing a value for timestamp attribute date
      2. No results are written to GCS
        • However, results are written to the local file system as temporary files
        • Results are written to PubSub correctly

      More context can be provided on request...

       

      Attachments

        1. streaming-pubsub-poc.zip
          1.02 MB
          Eric Kolotyluk

        Activity

          People

            Unassigned Unassigned
            kolotyluk Eric Kolotyluk
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: