There is an increasing need to secure data when Hadoop customers use upper layer applications such as MapReduce, Hive, Pig and HBase.
Hadoop CFS (Hadoop Cryptographic File System) secures this data. It is based on Hadoop's “FilterFileSystem”, decorates DFS or other file systems, and is transparent to upper layer applications. It is configurable, scalable and fast.
High-level requirements:
1. Transparent to upper layer applications, which require no modification.
2. The CFS input stream supports “Seek” and “PositionedReadable” if the wrapped file system supports them.
3. Very high performance for encryption and decryption, so that they do not become a bottleneck.
4. Can decorate HDFS and all other file systems in Hadoop without modifying the existing structure of the wrapped file system, such as the namenode and datanode structure when the wrapped file system is HDFS.
5. Admins can configure encryption policies, such as which directories are encrypted.
6. A robust key management framework.
7. Supports pread and append operations if the wrapped file system supports them.
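
Requirements 2 and 7 imply that the cipher must allow decryption to start at an arbitrary byte offset. The proposal does not name a cipher, so the following is only a minimal sketch under the assumption that AES in CTR mode is used, since a CTR keystream can be positioned at any offset without reading the preceding data. The class name CtrSeekSketch and the helpers counterFor, cipherAt and decryptRange are hypothetical and are not part of Hadoop or of this proposal; only the standard javax.crypto API is used.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: random-access decryption with AES/CTR (an assumed cipher choice).
public class CtrSeekSketch {
    private static final int BLOCK = 16; // AES block size in bytes

    // Counter block for the AES block that contains byte `offset`:
    // the initial IV plus the block index, reduced modulo 2^128.
    public static byte[] counterFor(byte[] iv, long offset) {
        BigInteger ctr = new BigInteger(1, iv).add(BigInteger.valueOf(offset / BLOCK));
        byte[] raw = ctr.toByteArray();
        byte[] out = new byte[BLOCK];
        int n = Math.min(raw.length, BLOCK);
        // Keep only the low-order 16 bytes, left-padding with zeros if shorter.
        System.arraycopy(raw, raw.length - n, out, BLOCK - n, n);
        return out;
    }

    // Returns a cipher whose keystream is positioned at byte `offset` of the file.
    public static Cipher cipherAt(int mode, byte[] key, byte[] iv, long offset) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(counterFor(iv, offset)));
        // Discard the keystream bytes that precede `offset` inside its block.
        int skip = (int) (offset % BLOCK);
        if (skip > 0) {
            c.update(new byte[skip]);
        }
        return c;
    }

    // Positioned read: decrypts `length` bytes starting at `offset` without touching earlier bytes.
    public static byte[] decryptRange(byte[] key, byte[] iv, byte[] ciphertext,
                                      int offset, int length) throws Exception {
        return cipherAt(Cipher.DECRYPT_MODE, key, iv, offset).doFinal(ciphertext, offset, length);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // demo key only; a real key would come from key management
        byte[] iv = new byte[16];  // demo IV only; a real IV would be unique per file
        byte[] plain = "0123456789abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = cipherAt(Cipher.ENCRYPT_MODE, key, iv, 0).doFinal(plain);
        byte[] part = decryptRange(key, iv, encrypted, 20, 10);
        System.out.println(new String(part, StandardCharsets.UTF_8));
    }
}
```

The same positioning step would also serve an append: under this assumption, an encrypting cipher created with cipherAt at the current file length continues the keystream where the existing data ends.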