I have two projects on Google Cloud Platform:
1) a service project for my Dataflow jobs
2) a host project for the Shared VPC and its subnetworks
The host project has firewall rules configured for the Dataflow job, i.e. allow all traffic, allow all internal traffic, allow all traffic tagged with 'dataflow', etc.
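For concreteness, the rules were created along these lines (the host project ID and rule name below are placeholders; the network name comes from the warning. Note that Dataflow worker-to-worker traffic, which shuffle depends on, runs on TCP ports 12345-12346, so any rule set must cover those ports):

```shell
# Hypothetical host project ID and rule name. Allows ingress between VMs
# tagged 'dataflow' on the shared network, covering the TCP ports
# 12345-12346 that Dataflow workers use to talk to each other (shuffle).
gcloud compute firewall-rules create allow-dataflow-internal \
    --project=host-project-id \
    --network=miles-qa-vpc \
    --direction=INGRESS \
    --allow=tcp:12345-12346 \
    --source-tags=dataflow \
    --target-tags=dataflow
```

The rule has to live in the host project, since that is where the Shared VPC network is defined.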
The job hangs when performing shuffle operations, and the logs show a warning that firewall rules are not correctly configured for the network 'miles-qa-vpc'.

Things I have tried so far:
1. Passing only the "subnetwork" argument without "network"; that only changes the warning to say "default" instead of "miles-qa-vpc", which looks like a logging error to me.
2. Firewall rules have been configured to:
- allow all traffic
- allow all internal traffic
- allow all traffic with the source tag 'dataflow'
- allow all traffic with the target tag 'dataflow'
3. The service account has been granted the Compute Network User role in both projects.
4. Ensured subnetwork is in the same region as the job.
5. The shared network from the host project is already happily serving a dedicated cluster for other purposes.
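For reference, this is the shape of the value I understand Dataflow expects for "subnetwork" with a Shared VPC: a full URL that names the host project (where the VPC lives), not the service project, with a region matching the job's region. All identifiers below are hypothetical stand-ins apart from the VPC name from the warning:

```python
# Hypothetical names; the point is the URL shape, not the specific IDs.
HOST_PROJECT = "host-project-id"  # hypothetical host project ID
REGION = "us-central1"            # hypothetical; must match the job region
SUBNET = "miles-qa-subnet"        # hypothetical subnet on miles-qa-vpc

# Full subnetwork URL referencing the HOST project, as Dataflow requires
# for Shared VPC setups.
subnetwork = (
    f"https://www.googleapis.com/compute/v1/projects/{HOST_PROJECT}"
    f"/regions/{REGION}/subnetworks/{SUBNET}"
)

# Passed to the job as --subnetwork=<this value>, with --network omitted,
# since the network is inferred from the subnetwork URL.
print(subnetwork)
```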
It genuinely seems like the spawned Compute Engine instances are not picking up the network configuration.
I expect the Dataflow job to stop reporting the firewall issue and to handle shuffle operations (GroupBy etc.) successfully.
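One way to check whether the spawned workers actually land on the shared subnetwork (the service project ID is a placeholder, and the name filter is a guess at the worker VM naming; adjust it to match your job name):

```shell
# Hypothetical service project ID. Lists Dataflow worker VMs and the
# subnetwork each one is actually attached to, to confirm whether they
# picked up the Shared VPC configuration or fell back to another network.
gcloud compute instances list \
    --project=service-project-id \
    --filter="name~dataflow" \
    --format="table(name, networkInterfaces[0].subnetwork)"
```

If the listed subnetwork is not the Shared VPC one, the job's network arguments are not reaching the workers.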