Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.14.3
- Fix Version/s: None
Description
We have a security requirement to encrypt Flink state on the client side for certain Flink applications that process sensitive data.
Currently, no feature supports this out of the box with the AWS S3 backend.
We found that one way to do this is to use flink-s3-fs-hadoop compiled against Hadoop 3.3.2 for checkpoints, because Hadoop 3.3.2 provides out-of-the-box AWS client-side encryption, using AWS KMS keys, before data is written to S3 (https://issues.apache.org/jira/browse/HADOOP-13887).
We were able to change the flink-filesystems shaded Hadoop version from the existing 3.2.2 to 3.3.2 and compile with minimal code changes. We placed the resulting flink-s3-fs-hadoop jar in the checkpoint plugin path for our Flink jobs, and it worked well for checkpoints and savepoints of up to 250 GB each with client-side encryption using AWS KMS.
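For illustration, here is a minimal sketch of how the encryption settings might reach the S3A client. It assumes the plugin forwards configuration keys prefixed with `s3.`, `s3a.`, or `fs.s3a.` to Hadoop under the `fs.s3a.` prefix, and that Hadoop 3.3.2 reads `fs.s3a.encryption.algorithm` and `fs.s3a.encryption.key`. The helper `to_hadoop_keys`, the bucket name, and the key placeholder are hypothetical and are not taken from our patch.

```python
# Sketch only: models the assumed key forwarding, not Flink's actual code.
FLINK_PREFIXES = ("s3.", "s3a.", "fs.s3a.")  # assumed prefixes the plugin recognises
HADOOP_PREFIX = "fs.s3a."


def to_hadoop_keys(flink_conf):
    """Return the subset of flink_conf that would be passed to Hadoop S3A."""
    hadoop_conf = {}
    for key, value in flink_conf.items():
        for prefix in FLINK_PREFIXES:
            if key.startswith(prefix):
                # Swap the Flink-side prefix for the Hadoop S3A prefix.
                hadoop_conf[HADOOP_PREFIX + key[len(prefix):]] = value
                break
    return hadoop_conf


# Hypothetical Flink configuration entries for client-side encryption.
flink_conf = {
    "state.checkpoints.dir": "s3://example-bucket/checkpoints",  # placeholder bucket
    "s3.encryption.algorithm": "CSE-KMS",      # assumed Hadoop 3.3.2 option value
    "s3.encryption.key": "<kms-key-arn>",      # placeholder, supply a real KMS key
}

hadoop_conf = to_hadoop_keys(flink_conf)
for name in sorted(hadoop_conf):
    print(name, "=", hadoop_conf[name])
```

Only the two `s3.`-prefixed entries are forwarded; the checkpoint directory stays a Flink-level setting.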
We are filing this Jira to request that these changes be taken upstream, and to check whether changing the Hadoop version raises concerns for other components, since our observations have been limited to the plugin jar and to checkpoints using the flink-s3-fs-hadoop filesystem.
Issue Links
- duplicates FLINK-27308 Update the Hadoop implementation for filesystems to 3.3.2 (Closed)