  1. Beam
  2. BEAM-9487

GBKs on unbounded pcolls with global windows and no triggers should fail

    Details

      Description

      This is according to "4.2.2.1 GroupByKey and unbounded PCollections" in https://beam.apache.org/documentation/programming-guide/, which states:

      If you do apply GroupByKey or CoGroupByKey to a group of unbounded PCollections without setting either a non-global windowing strategy, a trigger strategy, or both for each collection, Beam generates an IllegalStateException error at pipeline construction time.

      An example where this does not happen in the Python SDK: https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam

      I also believe that this unit test should fail, since test_stream is unbounded, uses the global window, and has no triggers.

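        from apache_beam import GroupByKey
        from apache_beam.testing.test_pipeline import TestPipeline
        from apache_beam.testing.test_stream import TestStream

        def test_global_window_gbk_fail(self):
          with TestPipeline() as p:
            test_stream = TestStream()
            _ = p | test_stream | GroupByKey()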
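
      The rule quoted from the programming guide can be sketched in plain Python as follows. This is only an illustration of the documented condition, not Beam's implementation: `PCollectionInfo` and `check_group_by_key_input` are hypothetical names, and `ValueError` is an arbitrary stand-in for the `IllegalStateException` that the guide mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCollectionInfo:
    """Hypothetical stand-in for the properties the documented rule depends on."""
    is_bounded: bool
    uses_global_window: bool
    uses_default_trigger: bool


def check_group_by_key_input(pcoll: PCollectionInfo) -> None:
    """Raise if a GroupByKey on this input would violate the documented rule.

    Hypothetical helper for illustration only; not part of the Beam API.
    """
    if pcoll.is_bounded:
        # Bounded input ends, so grouping can complete.
        return
    if pcoll.uses_global_window and pcoll.uses_default_trigger:
        # Assumption: ValueError stands in for the IllegalStateException
        # that the programming guide describes.
        raise ValueError(
            "GroupByKey on an unbounded PCollection needs non-global "
            "windowing, a trigger, or both")
```

      Under this sketch, the unit test above corresponds to `PCollectionInfo(is_bounded=False, uses_global_window=True, uses_default_trigger=True)`, which is the one combination that raises.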
      

            People

            • Assignee: Zachary Houfek (zhoufek)
            • Reporter: Udi Meiri (udim)


                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 18h 20m