Beam / BEAM-804

Python Pipeline Option save_main_session non-functional

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P1
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: sdk-py-core
    • Labels: None
    • Environment: OSX El Capitan, google-cloud-dataflow==0.4.3, python 2.7.12

    Description

      When I try to use the --save_main_session option, a pickling error occurs.

      pickle.PicklingError: Can't pickle <class 'apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum'>: it's not found as apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum

      This prevents me from using the option, which I need because my pipeline depends on an expensive object that I would like to create only once per worker. Creating it inline in the ParDo function is not practical unless I make the batch size sent to the ParDo quite large. Large batches seem to lead to idle workers, so ideally I want to bring the batch size way down.
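      One way to get a single instance per worker process is to build the object lazily behind a module-level cache, as in the minimal sketch below. `ExpensiveModel`, `get_model`, and `ScoreFn` are hypothetical names, not part of Beam or of the reporter's pipeline, and the `process(self, element)` signature is assumed from later Beam releases and may differ in the SDK version reported here. The sketch falls back to a plain base class when Beam is not installed, so it only illustrates the caching pattern.

```python
# Lazy, per-process cache: the expensive object is built on first use
# and reused for every later element handled by the same process.
_MODEL = None


class ExpensiveModel(object):
    """Hypothetical stand-in for the costly object from the report."""

    instances_created = 0

    def __init__(self):
        ExpensiveModel.instances_created += 1

    def score(self, element):
        # Placeholder work: the length of the element's string form.
        return len(str(element))


def get_model():
    """Return the cached model, creating it only on the first call."""
    global _MODEL
    if _MODEL is None:
        _MODEL = ExpensiveModel()
    return _MODEL


try:
    import apache_beam as beam
    _DoFnBase = beam.DoFn
except ImportError:
    # Assumption: lets the sketch run without Beam installed.
    _DoFnBase = object


class ScoreFn(_DoFnBase):
    """Hypothetical DoFn that reuses the cached model for each element."""

    def process(self, element):
        yield get_model().score(element)


fn = ScoreFn()
results = [r for e in ["a", "bb", "ccc"] for r in fn.process(e)]
```

      In a real pipeline the helper would probably need to live in an importable module shipped to the workers rather than in the main script, since globals in the main script are exactly what --save_main_session is meant to carry over.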

      The "Affects Version" option above doesn't have a 0.4.3 entry in the drop-down, so I did not populate it. However, this was already a problem in 0.4.1 and has not been fixed in 0.4.3.

      I don't see where I can attach a file, so here is the entire error.

      2016-10-24 10:00:16,071 <module> The oauth2client.contrib.multistore_file module has been deprecated and will be removed in the next release of oauth2client. Please migrate to multiprocess_file_storage.
      2016-10-24 10:00:16,127 __init__ Direct usage of TextFileSink is deprecated. Please use 'textio.WriteToText()' instead of directly instantiating a TextFileSink object.
      Traceback (most recent call last):
      File "00test.py", line 41, in <module>
      p.run()
      File "/usr/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 159, in run
      return self.runner.run(self)
      File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/dataflow_runner.py", line 172, in run
      self.dataflow_client.create_job(self.job))
      File "/usr/local/lib/python2.7/site-packages/apache_beam/utils/retry.py", line 160, in wrapper
      return fun(*args, **kwargs)
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/apiclient.py", line 375, in create_job
      job.options, file_copy=self._gcs_file_copy)
      File "/usr/local/lib/python2.7/site-packages/apache_beam/utils/dependency.py", line 325, in stage_job_resources
      pickler.dump_session(pickled_session_file)
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 204, in dump_session
      return dill.dump_session(file_path)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 333, in dump_session
      pickler.dump(main)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
      self.save(obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 123, in save_module
      return old_save_module(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 1168, in save_module
      state=_main_dict)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 425, in save_reduce
      save(state)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 159, in new_save_module_dict
      return old_save_module_dict(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 835, in save_module_dict
      StockPickler.save_dict(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 655, in save_dict
      self._batch_setitems(obj.iteritems())
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 687, in _batch_setitems
      save(v)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
      self.save_reduce(obj=obj, *rv)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 425, in save_reduce
      save(state)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 159, in new_save_module_dict
      return old_save_module_dict(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 835, in save_module_dict
      StockPickler.save_dict(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 655, in save_dict
      self._batch_setitems(obj.iteritems())
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 687, in _batch_setitems
      save(v)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
      self.save_reduce(obj=obj, *rv)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 425, in save_reduce
      save(state)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 159, in new_save_module_dict
      return old_save_module_dict(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 835, in save_module_dict
      StockPickler.save_dict(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 655, in save_dict
      self._batch_setitems(obj.iteritems())
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 687, in _batch_setitems
      save(v)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
      self.save_reduce(obj=obj, *rv)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 425, in save_reduce
      save(state)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 159, in new_save_module_dict
      return old_save_module_dict(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 835, in save_module_dict
      StockPickler.save_dict(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 655, in save_dict
      self._batch_setitems(obj.iteritems())
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 687, in _batch_setitems
      save(v)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
      self.save_reduce(obj=obj, *rv)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 425, in save_reduce
      save(state)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 159, in new_save_module_dict
      return old_save_module_dict(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 835, in save_module_dict
      StockPickler.save_dict(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 655, in save_dict
      self._batch_setitems(obj.iteritems())
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 687, in _batch_setitems
      save(v)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 159, in new_save_module_dict
      return old_save_module_dict(pickler, obj)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 835, in save_module_dict
      StockPickler.save_dict(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 655, in save_dict
      self._batch_setitems(obj.iteritems())
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 687, in _batch_setitems
      save(v)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
      self.save_reduce(obj=obj, *rv)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 400, in save_reduce
      save(func)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 1231, in save_type
      StockPickler.save_global(pickler, obj)
      File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 754, in save_global
      (obj, module, name))
      pickle.PicklingError: Can't pickle <class 'apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum'>: it's not found as apache_beam.internal.clients.dataflow.dataflow_v1b3_messages.TypeValueValuesEnum
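
      The last frames end in pickle's save_global, which stores a class by reference: it records the module and class name, then checks that the name can be looked up in that module. The message "it's not found as ..." means that lookup failed, which is what would happen if TypeValueValuesEnum is not a top-level attribute of dataflow_v1b3_messages (for example, if it is nested inside another class; that is an assumption, not something the traceback confirms). The sketch below reproduces the same class of failure with the standard library only. `make_hidden_class` and `try_pickle` are hypothetical helpers, and the exact message text differs between Python versions.

```python
import pickle


def make_hidden_class():
    # The class reports the name "Hidden", but no module-level attribute
    # called "Hidden" exists, so pickle's by-name lookup cannot find it.
    return type("Hidden", (object,), {})


def try_pickle(obj):
    """Return "ok" if obj pickles, else a short description of the error."""
    try:
        pickle.dumps(obj)
        return "ok"
    except pickle.PicklingError as exc:
        return "PicklingError: %s" % exc


# dict is reachable by name in its module, so it pickles by reference.
result_builtin = try_pickle(dict)
# The hidden class is not reachable by name, so pickling is refused.
result_hidden = try_pickle(make_hidden_class())

print(result_builtin)
print(result_hidden)
```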


          People

            Assignee: Sourabh Bajaj (sb2nov)
            Reporter: Zoran Bujila (zoranbujila)
            Votes: 0
            Watchers: 4
