Beam / BEAM-9509

Subprocess job server treats missing local file as remote URL

Details

    Description

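      When the requested job server jar (e.g. the one used by portableWordCountSparkRunnerBatch) does not exist, for example because it hasn't been built yet, the error message is misleading: the local path is logged as a download source and passed to urlopen as if it were a remote URL, which fails with "ValueError: unknown url type". Expected behavior is that the jar is recognized as a local file and a message is printed instructing the user to build it.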

      INFO:apache_beam.utils.subprocess_server:Downloading job server jar from /usr/local/google/home/kcweaver/go/src/github.com/apache/beam/runners/spark/job-server/build/libs/beam-runners-spark-job-server-2.21.0-SNAPSHOT.jar
      Traceback (most recent call last):
      File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
      "__main__", mod_spec)
      File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
      exec(code, run_globals)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 142, in <module>
      run()
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 121, in run
      result = p.run()
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/pipeline.py", line 495, in run
      self._options).run(False)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/pipeline.py", line 508, in run
      return self.runner.run_pipeline(self, self._options)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/spark_runner.py", line 45, in run_pipeline
      return super(SparkRunner, self).run_pipeline(pipeline, options)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 386, in run_pipeline
      job_service_handle = self.create_job_service(options)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 293, in create_job_service
      return JobServiceHandle(server.start(), options)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 86, in start
      self._endpoint = self._job_server.start()
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 111, in start
      cmd, endpoint = self.subprocess_cmd_and_endpoint()
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 156, in subprocess_cmd_and_endpoint
      jar_path = self.local_jar(self.path_to_jar())
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 153, in local_jar
      return subprocess_server.JavaJarServer.local_jar(url)
      File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/utils/subprocess_server.py", line 206, in local_jar
      url_read = urlopen(url)
      File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
      return opener.open(url, data, timeout)
      File "/usr/lib/python3.7/urllib/request.py", line 510, in open
      req = Request(fullurl, data)
      File "/usr/lib/python3.7/urllib/request.py", line 328, in __init__
      self.full_url = url
      File "/usr/lib/python3.7/urllib/request.py", line 354, in full_url
      self._parse()
      File "/usr/lib/python3.7/urllib/request.py", line 383, in _parse
      raise ValueError("unknown url type: %r" % self.full_url)
      ValueError: unknown url type: '/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/runners/spark/job-server/build/libs/beam-runners-spark-job-server-2.21.0-SNAPSHOT.jar'
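
      The expected behavior could be sketched roughly as follows. This is a minimal illustration, not the actual subprocess_server code: the helper name resolve_jar, its return shape, and the wording of the error message are hypothetical, and it assumes that only http and https locations count as remote.

```python
import os
from urllib.parse import urlparse


def resolve_jar(path_or_url):
    """Classify a jar location as remote or local (hypothetical helper).

    Returns ('remote', url) for http/https locations and ('local', path)
    for an existing local path. Raises FileNotFoundError with a build hint
    when a local path does not exist, instead of handing it to urlopen.
    """
    scheme = urlparse(path_or_url).scheme
    # Assumption: only http/https are treated as downloadable locations.
    if scheme in ('http', 'https'):
        return 'remote', path_or_url
    # Anything else is treated as a local filesystem path.
    if os.path.exists(path_or_url):
        return 'local', path_or_url
    raise FileNotFoundError(
        'Job server jar not found at %s. '
        'It may need to be built first.' % path_or_url)
```

      With a check like this, a path such as the one in the log above would produce a "not found, build it first" error rather than "unknown url type".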

            People

              ibzib Kyle Weaver

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m