Details
-
Bug
-
Status: Resolved
-
P2
-
Resolution: Information Provided
-
2.3.0
-
None
-
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro14,3
Processor Name: Intel Core i7
Processor Speed: 2.8 GHz
maven-compiler-plugin: 3.6.1
- source: 1.8
- target: 1.8
maven-shade-plugin: 3.1.0
exec-maven-plugin: 1.5.0
slf4j-api: 1.7.14
slf4j-jdk14: 1.7.14
google-cloud-dataflow-java-sdk-all: 2.3.0
google-cloud-bigquery: 0.26.0-beta
grpc-google-common-protos: 1.0.0
beam-sdks-java-io-amazon-web-services: 2.3.0 and 2.4.0
Description
Note:
I am sorry if it is difficult to read this report because I am not good at English.
Thank you for implementing S3FileSystem.
I implemented a program that performs FileIO against AWS S3 on Dataflow, and it works.
However, another Dataflow pipeline, which ran correctly until I added the SDK to its dependencies, no longer works.
Specifically, the following log message appears after the failing program starts:
Info: The AWS S3 Beam extension was included in this build, but the awsRegion flag was not specified. If you do not plan to use S3, then ignore this message. [Date]
In practice, a Dataflow job is created that never finishes. It keeps running without emitting any errors or logs.
If I pass 'awsRegion' as an argument, the pipeline runs successfully. But that is a strange workaround.
It means the AWS SDK is requesting connection information from a program that does not access S3. Isn't the pipeline being contaminated by the dependency?
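The workaround described above can be sketched roughly as follows. This is only an illustration in plain Java: the helper `withAwsRegion` and the region value `us-east-1` are hypothetical and not part of Beam. It assumes only that the region is supplied in the `--awsRegion=<value>` argument form that the log message refers to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AwsRegionWorkaround {

  // Hypothetical helper (not a Beam API): returns a copy of args that is
  // guaranteed to contain an --awsRegion=... argument.
  static String[] withAwsRegion(String[] args, String defaultRegion) {
    for (String arg : args) {
      if (arg.startsWith("--awsRegion=")) {
        // The caller already chose a region; leave the arguments unchanged.
        return args.clone();
      }
    }
    List<String> patched = new ArrayList<>(Arrays.asList(args));
    patched.add("--awsRegion=" + defaultRegion);
    return patched.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // "us-east-1" is a placeholder region chosen for this sketch.
    String[] patched = withAwsRegion(args, "us-east-1");
    // In a real pipeline, the patched arguments would be handed to the
    // pipeline's option parsing (for Beam, PipelineOptionsFactory.fromArgs).
    System.out.println(String.join(" ", patched));
  }
}
```

This only pads the arguments so that the region is always set. It does not explain why a pipeline that never touches S3 should need it.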
As far as I have been able to investigate, this log message seems to be emitted by the following code:
https://github.com/apache/beam/blob/7fa6292a21564744011fe94a7e50f7e074564b71/sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/s3/S3FileSystem.java#L108-L112
Is it required to pass the region as an argument?
Please tell me if I am wrong. If the pipeline is indeed being contaminated, I hope this problem will be fixed.
The SDK versions I tried:
google-cloud-dataflow-java-sdk-all: 2.3.0
beam-sdks-java-io-amazon-web-services: 2.3.0 and 2.4.0
Thank you for reading.