Beam / BEAM-4649

Failures in beam_PostCommit_Py_ValCont due to exception in read_log_control_messages

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.6.0
    • Component/s: sdk-py-harness
    • Labels: None

    Description

      Example job:
      https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-06-26_11_04_04-14383422363721841333?project=apache-beam-testing

      All of the failures have the same exception in the docker logs:
      I  2018/06/26 18:05:12 Executing: python -m apache_beam.runners.worker.sdk_worker_main
      I  Exception in thread read_log_control_messages:
      I  Traceback (most recent call last):
      I    File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
      I      self.run()
      I    File "/usr/local/lib/python2.7/threading.py", line 754, in run
      I      self.__target(*self.__args, **self.__kwargs)
      I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 61, in <lambda>
      I      target=lambda: self._read_log_control_messages(log_control_messages),
      I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/log_handler.py", line 107, in _read_log_control_messages
      I      for _ in log_control_iterator:
      I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
      I      return self._next()
      I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 335, in _next
      I      raise self
      I  _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
      I  
      I  Traceback (most recent call last):
      I    File "/usr/local/lib/python2.7/runpy.py", line 174, in _run_module_as_main
      I      "__main__", fname, loader, pkg_name)
      I    File "/usr/local/lib/python2.7/runpy.py", line 72, in _run_code
      I      exec code in run_globals
      I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 195, in <module>
      I      main(sys.argv)
      I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 134, in main
      I      worker_count=_get_worker_count(sdk_pipeline_options)).run()
      I    File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 104, in run
      I      for work_request in control_stub.Control(get_responses()):
      I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 344, in next
      I      return self._next()
      I    File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 324, in _next
      I      raise self
      I  grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>

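      This looks like a race condition: the harness log's startup message for the Beam Fn Logging service appears about six seconds after the connection exception in the docker log, which suggests the SDK worker tried to connect before the service was listening.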

      Harness log:
      2018-06-26 10:40:59.328 PDT Launched Beam Fn Logging service url: "localhost:12370"
      Docker log:
      2018-06-26 10:40:53.361 PDT Exception in thread read_log_control_messages:
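
If the cause is the startup ordering suggested above, one generic mitigation is for the client to wait until the endpoint accepts connections before opening its streams. The sketch below illustrates that idea with the standard library only. It is not taken from the Beam code base and is not necessarily how this issue was fixed; `wait_for_port` and its parameters are hypothetical names, and it targets Python 3 although the failing harness ran Python 2.7.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.2):
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires.

    Hypothetical helper for illustration; not part of apache_beam or grpc.
    Returns True once the endpoint accepts a connection, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            # Back off briefly before the next attempt.
            time.sleep(interval_s)
```

A worker could call `wait_for_port("localhost", 12370)` (the logging service address from the harness log above) before creating its gRPC channels, and fail with a clear error if it returns False. With gRPC itself, `grpc.channel_ready_future(channel).result(timeout=...)` serves a similar purpose at the channel level.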

          People

            Assignee: Alan Myrvold (alanmyrvold)
            Reporter: Alan Myrvold (alanmyrvold)
            Votes: 0
            Watchers: 2

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 20m