Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Done
1.8.0
Description
GCS supports resumable uploads, which we can use to build a RecoverableWriter similar to the S3 implementation:
https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
After using the Hadoop-compatible interface (https://github.com/apache/flink/pull/7519), we noticed that the current implementation relies heavily on renaming files on commit:
https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
This is suboptimal on an object store such as GCS, where a rename is a copy followed by a delete rather than a cheap metadata operation. We would therefore like to implement a more GCS-native RecoverableWriter.
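To illustrate the idea, here is a minimal in-memory sketch of a commit that finalizes an upload session directly under the target name instead of renaming a temporary file. Everything in it (`FakeStore`, `Recoverable`, the session IDs, the object name) is hypothetical and only models the concept; it does not use the GCS client API and is not necessarily how the final implementation works.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Conceptual model only: an in-memory stand-in for an object store with upload sessions. */
public class ResumableUploadSketch {

    /** Hypothetical store: a session buffers bytes until it is finalized into a visible object. */
    static final class FakeStore {
        private final Map<String, ByteArrayOutputStream> sessions = new HashMap<>();
        private final Map<String, byte[]> objects = new HashMap<>();
        private int nextId = 0;

        String startSession() {
            String id = "session-" + (nextId++);
            sessions.put(id, new ByteArrayOutputStream());
            return id;
        }

        /** Appends a chunk and returns the total bytes persisted for the session. */
        long append(String sessionId, byte[] chunk) {
            ByteArrayOutputStream buffer = sessions.get(sessionId);
            buffer.write(chunk, 0, chunk.length);
            return buffer.size();
        }

        long persistedBytes(String sessionId) {
            return sessions.get(sessionId).size();
        }

        /** Finalizes the session under the target name: no temporary object, no rename. */
        void commit(String sessionId, String objectName) {
            objects.put(objectName, sessions.remove(sessionId).toByteArray());
        }

        byte[] read(String objectName) {
            return objects.get(objectName);
        }
    }

    /** Hypothetical state a writer would store in a checkpoint so it can resume later. */
    static final class Recoverable {
        final String sessionId;
        final long offset;

        Recoverable(String sessionId, long offset) {
            this.sessionId = sessionId;
            this.offset = offset;
        }
    }

    static String run() {
        FakeStore store = new FakeStore();

        // First writer: upload some data and record a checkpoint.
        String session = store.startSession();
        long offset = store.append(session, "hello ".getBytes(StandardCharsets.UTF_8));
        Recoverable checkpoint = new Recoverable(session, offset);

        // Simulated failure: a new writer resumes from the checkpointed session.
        if (store.persistedBytes(checkpoint.sessionId) != checkpoint.offset) {
            throw new IllegalStateException("store and checkpoint disagree on the offset");
        }
        store.append(checkpoint.sessionId, "world".getBytes(StandardCharsets.UTF_8));

        // Commit publishes the object under its final name in one step.
        store.commit(checkpoint.sessionId, "output/part-0");
        return new String(store.read("output/part-0"), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the checkpoint only needs the session identifier and the persisted offset, and the commit step never copies data, which is the property the rename-based commit lacks on an object store.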
Issue Links
- causes: FLINK-25772 GCS filesystem fails license checker (Closed)
- is related to: FLINK-25577 Update GCS documentation (Closed)
- relates to: FLINK-19481 Add support for a flink native GCS FileSystem (Open)
- supersedes: FLINK-11378 Allow HadoopRecoverableWriter to write to Hadoop compatible Filesystems (Closed)
- links to