The easiest way to implement this is probably to log events from FSNamesystem to a dedicated log4j appender. That way, it can be turned off by default but enabled/configured by administrators. The subset of events should probably be restricted to those that map to DFSClient calls. As a first pass: create (startFile), mkdirs, setOwner, setPermission, delete, rename, open (getBlockLocations?), getFileStatus, setReplication, and listStatus all look like reasonable events to log. For all events, the ugi and path will be logged (date/time, etc. should be handled by the appender). For create, mkdirs, setOwner, and setPermission, the FsPermission information will be logged in addition to the ugi and path.
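To make the proposed fields concrete, here is a rough sketch of how one audit message might be assembled. The class name AuditLine, the key=value layout, the tab separator, and the sample ugi/permission strings are all assumptions for illustration only; nothing here is existing FSNamesystem code. In practice the resulting string would presumably be handed to a log4j logger at INFO, with the timestamp added by the appender's layout.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not part of FSNamesystem. */
final class AuditLine {
    // Operations for which the proposal also logs FsPermission information.
    private static final Set<String> WITH_PERMISSION = new HashSet<>(
        Arrays.asList("create", "mkdirs", "setOwner", "setPermission"));

    private AuditLine() {}

    /**
     * Builds one audit message from the ugi, the operation name, and the path.
     * Date/time is deliberately omitted: the appender is expected to add it.
     * The perm argument is only included for the permission-related operations.
     */
    static String format(String ugi, String cmd, String path, String perm) {
        StringBuilder sb = new StringBuilder();
        sb.append("ugi=").append(ugi)
          .append("\tcmd=").append(cmd)
          .append("\tpath=").append(path);
        if (WITH_PERMISSION.contains(cmd) && perm != null) {
            sb.append("\tperm=").append(perm);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values are made up; real ugi and permission strings may look different.
        System.out.println(format("alice,users", "mkdirs", "/user/alice/data",
                                  "alice:users:rwxr-xr-x"));
        System.out.println(format("bob,users", "open", "/user/alice/data/part-0", null));
    }
}
```

Keeping the message to flat key=value pairs should make the log easy to grep or parse, but the exact format is open to discussion.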
Thoughts? This isn't designed to be a secure audit log, and I'm sure issues like HADOOP-1741 will affect the approach to future audit logging, but it should provide sufficient information for administrators to manage HDFS.