Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-5488

DirectRunner not producing output on TextIO withWindowedWrites() and withNumShards(1)

Details

    Description

      Source of bug (Slack userĀ https://the-asf.slack.com/team/UCVN8DK7V) andĀ https://stackoverflow.com/questions/52445414/apache-beam-not-saving-unbounded-data-to-text-file.

      Example provided:

      public static void main(String[] args) {
          ExerciseOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().as(ExerciseOptions.class);
      
          Pipeline pipeline = Pipeline.create(options);
      
          pipeline
            .apply("Read Messages from Pubsub",
              PubsubIO
                .readStrings()
                .fromTopic(options.getTopicName()))
      
            .apply("Set event timestamp", ParDo.of(new DoFn<String, String>() {
              @ProcessElement
              public void processElement(ProcessContext context) {
                context.outputWithTimestamp(context.element(), Instant.now());
              }
            }))
      
            .apply("Windowing", Window.into(FixedWindows.of(Duration.standardMinutes(5))))
      
            .apply("Write to File",
              TextIO
                .write()
                .withWindowedWrites()
                .withNumShards(1)
                .to(options.getOutputPrefix()));
      
          pipeline.run();
        }
      

      Produces output when executed on the DataflowRunner, does not produce output on the DirectRunner.

      Attachments

        Activity

          People

            Unassigned Unassigned
            lcwik Luke Cwik

            Dates

              Created:
              Updated:

              Slack

                Issue deployment