  1. Beam
  2. BEAM-3530

DoFn.process should raise exception if something other than a List is returned

Details

    • Type: Improvement
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: sdk-py-core

    Description

      The process method of a DoFn can either return values or yield values. When returning values, it is expected to return a List (more generally, an iterable) of elements. When there is only a single value to return, it is easy to forget this and return the bare value instead.

      Correct way:

      import apache_beam as beam

      class SomeDoFn(beam.DoFn):
        def process(self, elem):
          return ['a']  # a list containing the single output element

      Incorrect way:

      class SomeDoFn(beam.DoFn):
        def process(self, elem):
          return 'a'  # bare element, not wrapped in a list

      A pipeline with the incorrect DoFn will fail with a cryptic error message that gives no direct indication that the actual cause is SomeDoFn returning an element instead of a List containing that element. This makes the issue very time-consuming to track down.
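To illustrate why the failure is hard to trace, here is a minimal plain-Python sketch. The `consume` function is a hypothetical, heavily simplified stand-in for whatever code iterates over the result of `process`; it is not Beam's actual implementation. It only shows how Python itself treats a bare value that is iterated as if it were a list.

```python
def consume(process_result):
    """Hypothetical stand-in for a runner: iterate over what process() returned."""
    return [out for out in process_result]


# A list with one element produces exactly one output.
one_output = consume(['abc'])

# A bare string is itself iterable, so it is silently split into characters.
split_output = consume('abc')

# A bare non-iterable value fails with a generic TypeError that does not
# mention the DoFn that returned it.
try:
    consume(42)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(one_output)
print(split_output)
print(error_message)
```

In both failure modes the symptom appears where the output is consumed, not where `process` returned it, which is consistent with the difficulty described above.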

      It would be good if the pipeline could raise an exception or otherwise indicate that the DoFn is incorrectly returning an element instead of a List to make it easier to identify the error.
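One possible shape for such a check is sketched below in plain Python. The helper `validate_process_output` and its parameters are hypothetical and are not part of the Beam API. Treating `None` as "no output" and rejecting `str`, `bytes`, and `dict` outright are assumptions made for this illustration.

```python
def validate_process_output(result, dofn_name):
    """Hypothetical check (not a Beam API): return an iterator over the
    outputs of process(), or raise an error that names the offending DoFn."""
    # Assumption: a process() that returns None emits no elements.
    if result is None:
        return iter(())
    # Assumption: these types are iterable, but returning them bare is
    # almost certainly a mistake, so they are rejected explicitly.
    if isinstance(result, (str, bytes, dict)):
        raise TypeError(
            "%s.process returned a bare %s; wrap it in a list, "
            "e.g. return [value]" % (dofn_name, type(result).__name__))
    try:
        return iter(result)
    except TypeError:
        raise TypeError(
            "%s.process returned a non-iterable %s; return a list of "
            "elements or yield them instead"
            % (dofn_name, type(result).__name__)) from None


# Usage: the correct form passes through unchanged.
print(list(validate_process_output(['a'], 'SomeDoFn')))

# Usage: the incorrect form now produces an error that names the DoFn.
try:
    validate_process_output('a', 'SomeDoFn')
except TypeError as exc:
    print(exc)
```

Because the message includes the DoFn name and the returned type, the error points directly at the method that needs fixing.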

          People

            Assignee: Unassigned
            Reporter: Chuan Yu Foo (chuanyu)
            Votes: 0
            Watchers: 3
