  1. Beam
  2. BEAM-1251 Python 3 Support
  3. BEAM-7540

deadlock using save_main_session and logging caused by threading.RLock pickling


Details

    • Type: Sub-task
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: sdk-py-core
    • Labels: None
    • Environment: Python 3.5, Linux, apache-beam 2.12.0 & 2.13.0, dill 0.2.9

    Description

      If you set save_main_session = True and have a logging.Logger instance in your __main__ module, then calling any logger method after Pipeline.run has been called makes the process hang and never exit.
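
A quick way to check whether a script meets the second condition is to scan a module's globals for logging.Logger instances. The sketch below uses only the standard library; the helper name `logger_globals` is hypothetical and not part of Beam or dill.

```python
import logging
import sys


def logger_globals(module=None):
    """Return the sorted names of module-level logging.Logger instances.

    Hypothetical helper: defaults to inspecting the __main__ module, which
    is the module this report says is affected by save_main_session.
    """
    if module is None:
        module = sys.modules["__main__"]
    return sorted(
        name
        for name, value in vars(module).items()
        if isinstance(value, logging.Logger)
    )
```

Called from the reproduction script in this report, it should list `_log`, the only module-level logger defined there.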

      Python 3 Pipeline that reproduces the error (code also available at https://gist.github.com/joar/f021db55eca4fa9e9fd7dfd67cc011b9):

      import logging
      
      import apache_beam as beam
      from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions
      
      _log = logging.getLogger(__name__)
      
      
      def main(argv=None):
          logging.basicConfig(level=logging.INFO)
      
          pipeline_options = PipelineOptions(argv)
      
          setup_options = pipeline_options.view_as(SetupOptions)  # type: SetupOptions
          setup_options.save_main_session = True
      
          _log.info("Running pipeline")
      
          with beam.Pipeline(runner="DirectRunner", options=pipeline_options) as p:
              p | beam.Create(["hello", "world"]) | beam.Map(lambda x: print(x))
      
          print("""
          Call to _log.info will now deadlock, since the logging handler's
          threading.RLock() has been passed through dill.
          
          When you press Ctrl-C, the traceback should confirm that the process is 
          stuck at:
          
            File "/usr/lib/python3.5/logging/__init__.py", line 810, in acquire
              self.lock.acquire()
          """)
          _log.info("Pipeline done")
          print("Launching nukes")
      
      
      if __name__ == '__main__':
          main()
      

       I have opened an issue with dill as well: https://github.com/uqfoundation/dill/issues/321

      This issue does (sadly) not happen on Python 2.

      Just to be clear: A workaround is to not use save_main_session = True.
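
The traceback quoted in the reproduction points at the handler's lock acquisition. Assuming the handler lock really is left in a state that can never be acquired, the following sketch probes each handler lock with a timeout instead of blocking forever, which may help confirm the diagnosis without pressing Ctrl-C. The helper name `find_stuck_handlers` is hypothetical, and it relies only on the standard `logging` and `threading` modules.

```python
import logging
import threading


def find_stuck_handlers(logger, timeout_s=1.0):
    """Return handlers whose lock could not be acquired within timeout_s.

    Walks the logger and its ancestors the same way record propagation
    does, so handlers installed on the root logger by
    logging.basicConfig are covered too.
    """
    stuck = []
    current = logger
    while current is not None:
        for handler in current.handlers:
            lock = getattr(handler, "lock", None)
            if lock is None:
                continue
            # Acquire with a timeout rather than blocking indefinitely.
            if lock.acquire(timeout=timeout_s):
                lock.release()
            else:
                stuck.append(handler)
        if not current.propagate:
            break
        current = current.parent
    return stuck
```

For example, calling `find_stuck_handlers(_log)` just before `_log.info("Pipeline done")` would return a non-empty list if a handler lock is stuck. The probe only reports the symptom; it does not release or repair the lock.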


          People

            Assignee: Unassigned
            Reporter: Joar Wandborg (joar)
            Votes: 0
            Watchers: 4
