Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.3.5
- Fix Version/s: None
- Component/s: None
Description
The S3A auditing feature of HADOOP-17511 is wonderful in production, as we can
- scan for all IO done by a single user, job, or operation
- get a comprehensive view of what was done to the store, in the order the store saw it. This doesn't always match the order the client logged it, especially for logged outcomes, which are only written when the relevant threads get scheduled.
The ABFS TracingContext effectively does most of this already. All that is needed is to integrate it with the hadoop-common audit interfaces/APIs:
- TracingContext to pick up the global/thread-local audit contexts at construction time and extract information from them (tool class, principal, Spark job ID, etc.); a sketch follows below.
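A minimal sketch of what that pickup could look like, assuming the CommonAuditContext API which HADOOP-17511 added to hadoop-common (package org.apache.hadoop.fs.audit). The AuditContextSnapshot helper and its method names are hypothetical, not existing ABFS code:
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

import org.apache.hadoop.fs.audit.CommonAuditContext;

/**
 * Hypothetical helper, not existing ABFS code: snapshot the
 * hadoop-common audit context so a TracingContext could carry
 * the entries for the life of an operation.
 */
public final class AuditContextSnapshot {

  private AuditContextSnapshot() {
  }

  /**
   * Collect global entries first, then thread-local ones;
   * thread-local values win on key collisions, matching their
   * narrower scope.
   */
  public static Map<String, String> snapshot() {
    Map<String, String> entries = new HashMap<>();

    // Process-wide entries (e.g. process ID, tool/command name).
    for (Map.Entry<String, String> e
        : CommonAuditContext.getGlobalContextEntries()) {
      entries.put(e.getKey(), e.getValue());
    }

    // Thread-local entries set by the caller, such as the manifest
    // committer's job/task IDs; values are lazily evaluated suppliers.
    CommonAuditContext context = CommonAuditContext.currentAuditContext();
    for (Map.Entry<String, Supplier<String>> e
        : context.getEvaluatedEntries().entrySet()) {
      String value = e.getValue().get();
      if (value != null) {
        entries.put(e.getKey(), value);
      }
    }
    return entries;
  }
}
{code}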
The manifest committer already sets the task ID on the active thread whenever it is entered; this persists through all the IO done by the Spark worker.
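For illustration, this is roughly how a job or task entry point populates the thread-level context that the snapshot above would pick up. The key names come from org.apache.hadoop.fs.audit.AuditConstants, while the TaskAuditSetupExample class and the sample IDs are made up for the example:
{code:java}
import org.apache.hadoop.fs.audit.AuditConstants;
import org.apache.hadoop.fs.audit.CommonAuditContext;

// Illustrative only: roughly what a committer does when a task is
// entered. The IDs below are fabricated sample values.
public class TaskAuditSetupExample {

  static void enterTask(String jobId, String taskAttemptId) {
    CommonAuditContext context = CommonAuditContext.currentAuditContext();
    // Thread-local entries: visible to all IO on this thread until
    // removed or the context is reset.
    context.put(AuditConstants.PARAM_JOB_ID, jobId);
    context.put(AuditConstants.PARAM_TASK_ATTEMPT_ID, taskAttemptId);
  }

  public static void main(String[] args) {
    enterTask("job_1700000000000_0001",
        "attempt_1700000000000_0001_m_000000_0");
    // Any store request issued on this thread from here on can
    // pick these entries up and include them in its audit data.
  }
}
{code}
Because the entries are thread-local, every store request issued on that thread afterwards can be correlated with the job and task attempt in the audit logs.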
Attachments
Issue Links
- relates to HADOOP-17511 Add an Audit plugin point for S3A auditing/context (Resolved)
- links to