Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-10872

Value Provider functionality broken in python sdk

Details

    • Important

    Description

      Whether using a custom function or an IO connector (test with `WriteToBigQuery`), the dataflow job complains that the runtime value provider `get()` is being called from a non-runtime context:

      import argparse
      import logging
      
      import apache_beam as beam
      from apache_beam.options.pipeline_options import PipelineOptions
      
      
      class ExampleDoF(beam.DoFn):
          def __init__(self, value_provider):
              self.value_provider = value_provider
      
          def process(self, el):
              logging.info(f'el: {el}')
              logging.info(f'value provider: {self.value_provider.get()}')
              yield el
      
      
      class UserOptions(PipelineOptions):
          @classmethod
          def _add_argparse_args(cls, parser):
              parser.add_value_provider_argument('--value_provider', type=int)
      
      parser = argparse.ArgumentParser()
      known_args, pipeline_args = parser.parse_known_args()
      pipeline_options = PipelineOptions(pipeline_args)
      user_options = pipeline_options.view_as(UserOptions)
      with beam.Pipeline(options=pipeline_options) as pipeline:
          results = (
                  pipeline
                  | beam.Create(['element'])
                  | beam.ParDo(ExampleDoF(user_options.value_provider))
          )
      
      
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            jkarimi Jay K

            Dates

              Created:
              Updated:

              Slack

                Issue deployment