ORC-14: Add column level encryption to ORC files

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      It would be useful to support column level encryption in ORC files. Since each column and its associated index is stored separately, encrypting a column separately isn't difficult. In terms of key distribution, it would make sense to use an external server like the one in HADOOP-9331.
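
      Since each column's streams are written independently, a writer can wrap only the selected columns in a cipher stream and leave the rest untouched. Below is a minimal, hypothetical sketch of that idea using the JCE; the maybeEncrypt method and its key handling are illustrative assumptions, not ORC's actual API.

      ```java
      import javax.crypto.Cipher;
      import javax.crypto.CipherOutputStream;
      import javax.crypto.spec.IvParameterSpec;
      import javax.crypto.spec.SecretKeySpec;
      import java.io.OutputStream;

      public class ColumnEncryptionSketch {
        // Hypothetical: because each column's data and index streams are
        // stored separately, encryption can be applied per column. Columns
        // without a key are written through unchanged.
        static OutputStream maybeEncrypt(OutputStream columnStream,
                                         SecretKeySpec columnKey,
                                         byte[] iv) throws Exception {
          if (columnKey == null) {
            return columnStream;               // unencrypted column
          }
          Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
          cipher.init(Cipher.ENCRYPT_MODE, columnKey, new IvParameterSpec(iv));
          return new CipherOutputStream(columnStream, cipher);
        }
      }
      ```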

          Activity

          Owen O'Malley added a comment - edited

          Ignore my previous comment.

          I am getting closer. I still need to write this up for the website, but the general direction is:

          • Add support for encrypting columns, where the writer adds two alternatives into the file:
            a. Encrypted original data
            b. Unencrypted masked data
          • The format change is backward compatible: old readers will get the unencrypted masked values.
          • It will use the Hadoop KMS by default, although it may be overridden.
          • Encryption will be AES (128- or 256-bit) in CTR mode, which allows seeks (see the CTR seek sketch after this list).
          • Different columns may use different master keys. Each writer will generate a random file id that is used to create a unique encryption key for the column in that file. To read an encrypted column, the user will need to have the KMS decrypt the column's encryption key (see the key-derivation sketch after this list).
          • The file and stripe statistics will be encrypted for the encrypted columns. However, the list of streams in the stripe footer will not be encrypted.
          • Masking of data may take several forms (see the redaction sketch after this list):
            a. Nullify - make all values null
            b. Redact - replace characters based on character class ('x' for letters, '9' for numbers, etc.)
            c. SHA256 - replace strings and numbers with the SHA-256 hash of the value
            d. Custom - user-defined method
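
          To make the "CTR mode allows seeks" point concrete, here is a minimal, hypothetical sketch (not ORC's actual reader code) of positioning a JCE AES/CTR cipher at an arbitrary byte offset. The counter block for any offset can be computed directly from the base IV, so nothing before the seek target has to be decrypted.

          ```java
          import javax.crypto.Cipher;
          import javax.crypto.spec.IvParameterSpec;
          import javax.crypto.spec.SecretKeySpec;
          import java.math.BigInteger;

          public class CtrSeekSketch {
            private static final int AES_BLOCK = 16;

            // Compute the counter (IV) for the AES block containing byteOffset,
            // treating the 16-byte base IV as a big-endian counter.
            static IvParameterSpec counterFor(byte[] baseIv, long byteOffset) {
              BigInteger counter = new BigInteger(1, baseIv)
                  .add(BigInteger.valueOf(byteOffset / AES_BLOCK));
              byte[] raw = counter.toByteArray();
              byte[] iv = new byte[AES_BLOCK];
              int n = Math.min(raw.length, AES_BLOCK);
              System.arraycopy(raw, raw.length - n, iv, AES_BLOCK - n, n);
              return new IvParameterSpec(iv);
            }

            // Decrypt bytes that start at byteOffset within the encrypted stream.
            static byte[] decryptAt(SecretKeySpec key, byte[] baseIv,
                                    long byteOffset, byte[] encrypted)
                throws Exception {
              Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
              cipher.init(Cipher.DECRYPT_MODE, key, counterFor(baseIv, byteOffset));
              // Burn the keystream for the partial block before byteOffset.
              int skip = (int) (byteOffset % AES_BLOCK);
              if (skip > 0) {
                cipher.update(new byte[skip]);
              }
              byte[] out = cipher.update(encrypted);
              return out != null ? out : new byte[0];
            }
          }
          ```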
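
          The key derivation is only described at a high level above; the HMAC construction below is one plausible way to combine a master key, the random file id, and the column name into a unique per-file column key. It is an assumption for illustration, not ORC's actual scheme (in practice the KMS would hold the master key and perform the decryption).

          ```java
          import javax.crypto.Mac;
          import javax.crypto.spec.SecretKeySpec;
          import java.nio.charset.StandardCharsets;

          public class ColumnKeySketch {
            // Hypothetical derivation: HMAC(master key, file id || column name),
            // truncated to 128 bits for an AES-128 column key.
            static SecretKeySpec deriveColumnKey(byte[] masterKey, byte[] fileId,
                                                 String columnName)
                throws Exception {
              Mac mac = Mac.getInstance("HmacSHA256");
              mac.init(new SecretKeySpec(masterKey, "HmacSHA256"));
              mac.update(fileId);
              mac.update(columnName.getBytes(StandardCharsets.UTF_8));
              byte[] derived = mac.doFinal();
              return new SecretKeySpec(derived, 0, 16, "AES");
            }
          }
          ```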
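
          Finally, a small sketch of the "redact" mask under the character-class rules listed above ('x' for letters, '9' for numbers); the method name is hypothetical.

          ```java
          public class RedactSketch {
            // Replace characters by class so readers without the key see the
            // shape of a value but not its content. Punctuation and whitespace
            // pass through unchanged.
            static String redact(String value) {
              StringBuilder masked = new StringBuilder(value.length());
              for (int i = 0; i < value.length(); i++) {
                char c = value.charAt(i);
                if (Character.isLetter(c)) {
                  masked.append('x');
                } else if (Character.isDigit(c)) {
                  masked.append('9');
                } else {
                  masked.append(c);
                }
              }
              return masked.toString();
            }
            // Example: redact("John Smith, SSN 123-45-6789")
            //          -> "xxxx xxxxx, xxx 999-99-9999"
          }
          ```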
          Owen O'Malley added a comment -

          I've started working on this. I'll post a patch this week.

          Larry McCay added a comment -

          CMF will be used to access keying material for column level encryption.

          Larry McCay added a comment -

          I am in the process of reworking the patch for HADOOP-9534 (Credential Management Framework) in order to support accessing keying material for this issue. Current thinking is that CMF can abstract the source of keys and be leveraged across a number of different crypto and password-protection use cases in the Hadoop ecosystem. This is why it is being done in Hadoop rather than Hive. We will also want to align its use with HADOOP-9331, since 9331 will be leveraged here as well as for the cryptoFS, etc.

          I will provide a description of the DDL/metastore and column-store changes that will be needed to support column level encryption once I have it written up.

          Andrew Purtell added a comment -

          > Yes if the code is available and provides the right API.

          Owen O'Malley: HADOOP-9331 proposes and provides an API, and HADOOP-9332 provides a codec implementing support for AES with optional hardware acceleration. This seems like an ideal use case for both. Should you have any proposed improvements to the API, please don’t hesitate to raise them on HADOOP-9331, where they will be promptly addressed. Likewise with the AES codec, please don’t hesitate to raise those on HADOOP-9332.

          Owen O'Malley added a comment -

          Andrew,
          Yes if the code is available and provides the right API.

          Andrew Purtell added a comment -

          So do you envision this as using the facilities provided by HADOOP-9331?

          Owen O'Malley added a comment -

          Supun,
          I've tagged this for Google Summer of Code. Take a look at:
          http://www.google-melange.com/gsoc/homepage/google/gsoc2013

          Supun Kamburugamuva added a comment -

          I'm a computer science graduate student at Indiana University, Bloomington, and my research area is distributed computing. I'm also a committer on a few Apache projects. I'm new to Hadoop and Hive, and I would like to learn about and contribute to these projects. It would be great if you could let me know the areas I should look at to get started.

          Regards,
          Supun Kamburugamuva


            People

            • Assignee: Owen O'Malley
            • Reporter: Owen O'Malley
            • Votes: 0
            • Watchers: 19
