When running Beam Python pipelines on Flink, Spark, etc., we connect the SDK to the job server using the job_endpoint pipeline option. The job server then returns the address of the artifact staging endpoint to the SDK.
This is problematic when the job server runs in a network environment where it is not aware of its external hostname, for example Kubernetes. In that case the job server will return something like localhost:8098, which may not be reachable from where the SDK is running. We do have a --job-host option, but it is used both internally and externally, and the internal and external host names may not be the same.
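A minimal sketch (plain Python, no Beam dependency) of the handshake described above and why it breaks: the job server reports an artifact staging address it believes in, which is only valid inside its own pod. The function and field names here are illustrative, not Beam's actual API.

```python
def prepare_job(job_endpoint):
    """Stand-in for the job server's prepare call.

    The server replies with, among other things, the artifact staging
    endpoint it believes clients can reach. Inside Kubernetes it only
    knows its internal view of the network, so it reports localhost.
    """
    return {"artifact_staging_endpoint": "localhost:8098"}

response = prepare_job("flink-jobserver:8099")
# The SDK, running outside the pod, cannot stage artifacts to this address:
print(response["artifact_staging_endpoint"])
```

The point is that the SDK has no say in the matter: whatever the server reports is what gets used for artifact staging.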
One solution would be to configure two separate host names in the job server, but I would rather avoid that option because of the complexity it adds.
The more straightforward solution is to add a pipeline option to the Python SDK that overrides the artifact staging endpoint returned by the job server.
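The proposed override could be as simple as preferring a user-supplied value over the server-reported one. A sketch of that resolution logic, assuming a hypothetical --artifact_endpoint pipeline option (the option name and function are illustrative, not the final API):

```python
def resolve_artifact_endpoint(server_reported, user_override=None):
    """Pick the artifact staging endpoint the SDK should actually use.

    If the user supplied an override (e.g. via a hypothetical
    --artifact_endpoint pipeline option), use it; otherwise fall back
    to whatever the job server reported about itself.
    """
    return user_override or server_reported

# Without the option, the SDK is stuck with the server's internal view:
endpoint = resolve_artifact_endpoint("localhost:8098")

# With the option, the user supplies the externally reachable address,
# e.g. a Kubernetes service name (illustrative):
endpoint = resolve_artifact_endpoint(
    "localhost:8098", user_override="beam-jobserver.default.svc:8098")
```

This keeps the job server unchanged and puts the knowledge of the external address where it actually exists: with the user submitting the pipeline.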