[HDFS-4680] Audit logging of delegation tokens for MR tracing - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.3-alpha
Fix Version/s: 2.1.1-beta
Component/s: namenode, security
Labels: None
Target Version/s: 2.1.1-beta

Description

HDFS audit logging tracks HDFS operations made by different users, e.g. creation and deletion of files. This is useful for after-the-fact root cause analysis and security. However, logging only the username is insufficient for many use cases. For instance, it is common for a single user to run multiple MapReduce jobs (I believe this is the case with Hive). In this scenario, given a particular audit log entry, it is difficult to trace it back to the MR job or task that generated that entry.

I see a number of potential options for implementing this.

1. Make an optional "client name" field part of the NN RPC format. We already pass a clientName as a parameter in many RPC calls, so this would essentially standardize it. MR tasks could then set this field to the job and task ID.
2. This could be generalized to a set of optional key-value tags in the NN RPC format, which would then be audit logged. This has standalone benefits outside of just verifying MR task IDs.
3. Neither of the above two options securely verifies that MR clients are who they claim to be. Doing this securely requires the JobTracker to sign MR task attempts and the NN to verify this signature. However, this is substantially more work, and could be built on top of idea #2.
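
To make options 2 and 3 a bit more concrete, here is a rough Python sketch, for illustration only and not taken from any of the attached patches. Every name in it (`SECRET`, `sign_task_attempt`, `verify_task_attempt`, `format_audit_entry`, and the `jobId`/`taskId` tag keys) is hypothetical, and the tab-separated `key=value` layout is an assumption rather than the NN's actual audit format. It assumes the JobTracker and NN share a secret, which is only one of several ways the signing could be done.

```python
import hashlib
import hmac

# Hypothetical secret shared by the JobTracker and the NameNode (assumption).
SECRET = b"illustrative-shared-secret"


def sign_task_attempt(task_attempt_id: str, secret: bytes = SECRET) -> str:
    """JobTracker side (option 3): sign a task attempt ID with HMAC-SHA256."""
    return hmac.new(secret, task_attempt_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_task_attempt(task_attempt_id: str, signature: str, secret: bytes = SECRET) -> bool:
    """NN side (option 3): recompute the signature and compare in constant time."""
    expected = sign_task_attempt(task_attempt_id, secret)
    return hmac.compare_digest(expected, signature)


def format_audit_entry(user: str, cmd: str, src: str, tags: dict) -> str:
    """Option 2: append optional key-value tags to an audit log line.

    Tags are sorted by key so the output is stable across calls.
    """
    fields = [("ugi", user), ("cmd", cmd), ("src", src)]
    fields.extend(sorted(tags.items()))
    return "\t".join(f"{key}={value}" for key, value in fields)
```

With something like this, the NN would only log the `taskId` tag as trusted after `verify_task_attempt` succeeds; an unsigned tag from option 2 alone is still useful for tracing but can be spoofed by the client.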

Thoughts welcome.

Attachments

- hdfs-4680-1.patch, 31/Jul/13 05:38, 5 kB, Andrew Wang
- hdfs-4680-2.patch, 06/Aug/13 17:44, 12 kB, Andrew Wang
- hdfs-4680-3.patch, 08/Aug/13 21:29, 11 kB, Andrew Wang
- hdfs-4680-4.patch, 03/Sep/13 01:56, 14 kB, Andrew Wang
- hdfs-4680-5.patch, 09/Sep/13 05:17, 17 kB, Andrew Wang

Issue Links

is related to

- HADOOP-9775 Add tracking IDs to FS tokens to allow tracing FS operations back to job (Resolved)

relates to

- HDFS-9184 Logging HDFS operation's caller context into audit logs (Resolved)
- MAPREDUCE-5379 Include token tracking ids in jobconf (Closed)

People

Assignee: Andrew Wang

Reporter: Andrew Wang

Votes: 0

Watchers: 18

Dates

Created: 09/Apr/13 22:21

Updated: 01/Oct/15 20:31

Resolved: 11/Sep/13 20:05