Beam / BEAM-12075

GroupByKey doesn't seem to work with FixedWindows in DirectRunner

Details

    Description

      After applying `FixedWindows` to a streaming source, a downstream `GroupByKey` never emits the keyed elements for any window.

      This example without `GroupByKey` prints all the windowed elements:

       

      pipeline
       .apply("ReadFromPubsub", PubsubIO.readStrings().fromSubscription(subscriptionPath))
       .apply(Window.into(FixedWindows.of(Duration.standardSeconds(5L))))
       .apply(WithKeys.of("bobcat"))
       .apply(MapElements.into(TypeDescriptors.nulls()).via(
           (KV<String, String> pair) -> {
               LOG.info("Key: " + pair.getKey() + "\tValue: " + pair.getValue());
               return null;
           }
       ));

       

      This example with `GroupByKey` doesn't emit anything:

       

      pipeline
       .apply("ReadFromPubsub", PubsubIO.readStrings().fromSubscription(subscriptionPath))
       .apply(Window.into(FixedWindows.of(Duration.standardSeconds(5L))))
       .apply(WithKeys.of("bobcat"))
       .apply(GroupByKey.create())
       .apply(FlatMapElements.into(TypeDescriptors.nulls()).via(
           (KV<String, Iterable<String>> pair) -> {
               pair.getValue().forEach(message -> LOG.info("Message: " + message));
               return null;
           }
       ));

       

      I'm using the DirectRunner. The same logic works in Python with both the DirectRunner and the DataflowRunner.
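
      For reference, the grouping that the second pipeline is expected to produce can be sketched in plain Java, without Beam. This is only an illustration of the expected result, not Beam's implementation: the class name `WindowedGroupingSketch`, the sample timestamps, and the sample values are hypothetical, and the sketch assumes 5-second windows aligned to the epoch. It also ignores watermarks and triggers, which determine when a real runner emits each group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this does not use or reproduce Beam's implementation.
class WindowedGroupingSketch {

    // Start of the fixed window that contains timestampMillis,
    // assuming windows are aligned to the epoch (offset 0).
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Groups values by (key, window start). Each map entry corresponds to one
    // KV<String, Iterable<String>> that a per-window grouping would be expected to emit.
    static Map<String, List<String>> group(
            String key, long[] timestamps, String[] values, long sizeMillis) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (int i = 0; i < values.length; i++) {
            String groupKey = key + "@" + windowStart(timestamps[i], sizeMillis);
            grouped.computeIfAbsent(groupKey, k -> new ArrayList<>()).add(values[i]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical message timestamps (milliseconds) and payloads.
        long[] timestamps = {1_000L, 4_999L, 5_000L, 12_345L};
        String[] values = {"a", "b", "c", "d"};
        group("bobcat", timestamps, values, 5_000L)
            .forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

      With the sample data, messages "a" and "b" fall into the window starting at 0 ms, "c" into the window starting at 5000 ms, and "d" into the window starting at 10000 ms, so three groups are expected for the key "bobcat".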

          People

            Assignee: Unassigned
            Reporter: Tianzi Cai (tianzi)