Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 0.11.2
- None
- None
Description
With HADOOP-9331 and MAPREDUCE-5025 in place, MapReduce jobs can process and output encrypted data. For Pig users to take advantage of this capability, Pig should be able to accept the key and pass it to MapReduce, so that MapReduce can do the work on Pig's behalf. The scope of this Jira is limited to passing the key to MapReduce and taking advantage of HADOOP-9331 and MAPREDUCE-5025 without breaking Pig.
To achieve that, the file input format and file output format interfaces will be modified to handle CryptoCodec, set the context properly, and provide key facilities.
File input/output formats that do not support compression (through CompressionCodec) cannot be addressed by this work, because the encryption feature (HADOOP-9331 and related issues) is based on CompressionCodec.
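The data flow described above (Pig accepts a key, places it where the job can see it, and the file format reads it back before configuring the codec) can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain `Map` stands in for the job configuration, and the property name `sketch.crypto.key` and all method names are hypothetical. The actual key distribution mechanism is defined by MAPREDUCE-5025 and would not store a raw key in plain configuration.

```java
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class KeyPassThroughSketch {
    // Hypothetical property name, used only for this illustration.
    static final String KEY_PROPERTY = "sketch.crypto.key";

    /** Stand-in for a job configuration: a plain string-to-string map. */
    static Map<String, String> newJobConf() {
        return new HashMap<>();
    }

    /** Front-end side: record the user-supplied key so the job can see it. */
    static void setKey(Map<String, String> conf, byte[] key) {
        conf.put(KEY_PROPERTY, Base64.getEncoder().encodeToString(key));
    }

    /** Task side: read the key back; returns null when no key was supplied. */
    static byte[] getKey(Map<String, String> conf) {
        String encoded = conf.get(KEY_PROPERTY);
        return encoded == null ? null : Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        Map<String, String> conf = newJobConf();
        setKey(conf, new byte[] {1, 2, 3, 4});
        byte[] key = getKey(conf);
        System.out.println("key bytes available to the format: " + key.length);
    }
}
```

Returning `null` when no key is present lets a format fall back to its existing unencrypted path, which is one way to keep current Pig scripts working unchanged.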
With this change, Pig can cover the following use cases:
a. Pig users can run a query on encrypted data
b. Pig users can store encrypted data
c. Pig can output encrypted data
Access to encrypted HBase storage/tables, or to any other encrypted storage format that Pig can query, should be addressed in separate Jiras if needed, because HBase and other systems might have their own key management mechanisms or their own ways of interfacing with Pig.
To handle versions of Hadoop that do not have crypto support, we can avoid compilation problems by segregating crypto API usage into separate files that are included only if a flag is defined on the Ant command line (something like -Dcrypto).
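When crypto-specific code is kept in separate, optionally compiled files, the rest of the code base still needs a way to find out at run time whether those files were built in. One common complement to a build flag, not something this Jira prescribes, is a reflection check. The sketch below assumes a placeholder class name (`org.example.HypotheticalCryptoSupport`); the real class would be whatever the optional source files define.

```java
public class OptionalCryptoSketch {
    /**
     * Returns true if the named class can be located on the classpath.
     * The class is not initialized, so no static initializers run.
     */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, OptionalCryptoSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder name; pass a real class name as the first argument to probe it.
        String cryptoClass = args.length > 0 ? args[0] : "org.example.HypotheticalCryptoSupport";
        if (isClassAvailable(cryptoClass)) {
            System.out.println("crypto support found");
        } else {
            System.out.println("crypto support missing; using the non-crypto path");
        }
    }
}
```

Because the check refers to the optional class only by name, this file compiles against Hadoop versions with or without crypto support.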
Attachments
Issue Links
- requires
  - AVRO-1372 Avro file data encryption for Java (Open)
  - MAPREDUCE-5025 Key Distribution and Management for supporting crypto codec in Map Reduce (Open)
  - HADOOP-9331 Hadoop crypto codec framework and crypto codec implementations (Open)
  - HADOOP-9996 Improve TFile format to support any compression codecs (Open)
  - HADOOP-9997 Improve TFile API to be able to pass the context for encryption codecs (Open)