I think AuditLogManager can be pulled into a separate class file.
I would instead suggest that the keys be defined as an enum.
I am not sure 'agent' is conveying the right meaning.
HDFS uses ugi. As we discussed, we will be logging usernames in MapReduce. I think 'agent' can act as a common word for either ugi or username: an agent is someone external to the system who tries to cause a state change. I am OK with changing it to something better. I named the IP key 'ip' because HDFS does.
Reason seems confusing.
I have removed it in my next patch. It doesn't add much value to MapReduce because we are dealing with authorization events here. I made them enums for ease of parsing: strings in key=value format can cause problems with spaces and the like.
Don't tabs make the lines appear too long,
This is in line with HDFS.
Some values like 'permissions' (if this is a free-format string) can contain whitespace.
I would prefer a one-word representation of permissions in audit logs. Example
Key=value pairs with free-format strings can lead to issues. We could quote them, but the parser logic would become complex.
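To illustrate the parsing concern, here is a minimal sketch, not the code in the patch. The class name AuditLineSketch, the Key enum and its members, and both methods are hypothetical names chosen for this example. It joins key=value pairs with tabs and parses them back with a naive whitespace split, which is enough to show how a free-format value gets cut short.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from the patch.
final class AuditLineSketch {
    // Assumed key set, loosely based on the keys mentioned in this thread.
    enum Key { AGENT, IP, OPERATION, TARGET, PERMISSIONS }

    // Joins key=value pairs with tabs, in the declaration order of Key.
    static String format(EnumMap<Key, String> fields) {
        StringBuilder line = new StringBuilder();
        for (Map.Entry<Key, String> e : fields.entrySet()) {
            if (line.length() > 0) {
                line.append('\t');
            }
            line.append(e.getKey().name().toLowerCase(Locale.ROOT))
                .append('=')
                .append(e.getValue());
        }
        return line.toString();
    }

    // Naive parser: splits on any whitespace, then on the first '='.
    static EnumMap<Key, String> parse(String line) {
        EnumMap<Key, String> fields = new EnumMap<>(Key.class);
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq < 0) {
                continue; // stray word, e.g. the tail of a value that contained a space
            }
            Key key = Key.valueOf(token.substring(0, eq).toUpperCase(Locale.ROOT));
            fields.put(key, token.substring(eq + 1));
        }
        return fields;
    }

    public static void main(String[] args) {
        EnumMap<Key, String> fields = new EnumMap<>(Key.class);
        fields.put(Key.AGENT, "alice");
        fields.put(Key.OPERATION, "SUBMIT_JOB");
        System.out.println(parse(format(fields)));   // one-word values round-trip

        fields.put(Key.PERMISSIONS, "read write");   // free-format value
        System.out.println(parse(format(fields)));   // the value is cut at the space
    }
}
```

With one-word values the line round-trips through the naive parser. A value containing a space does not, unless the parser is taught about quoting or is restricted to splitting on tabs only.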
Suggest the parameter 'operation' could be of type String as well.
I would prefer enums, with JobTracker.Operations and JobInProgress.Operations. Note that we already have QueueManager.QueueOperations.
If there's no cost to logging remote-ip,
The comment '//for testcases' is misleading.
Why are we logging success again in JobTracker.addJob? Wouldn't the same line come up in QueueManager.checkAccess?
Because there are two authorization checks involved:
1) Queue access (success or failure)
2) Job submission (success or failure based on username in config)
It seems the operations "REFRESH_NODES" and "QUEUE_REFRESH" and the target "Jobtracker" could be static final variables. Should we pull all such AuditLogManager-related constants into a constants file, such as AuditLogConstants?
+1 for factoring it out.
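One possible shape for such a file is sketched below. Only the class name AuditLogConstants and the three literal values come from this thread. The field name TARGET_JOBTRACKER and the Operation enum are assumptions made for illustration, with the operations written as an enum to match the preference stated earlier.

```java
// Hypothetical sketch of a constants holder; not the code in the patch.
final class AuditLogConstants {
    private AuditLogConstants() {
        // constants only, no instances
    }

    // Target name used when the JobTracker itself is acted upon (assumed field name).
    static final String TARGET_JOBTRACKER = "Jobtracker";

    // Operations that are not tied to a single job or queue (assumed enum).
    enum Operation {
        REFRESH_NODES,
        QUEUE_REFRESH
    }
}
```

A call site would then refer to AuditLogConstants.Operation.QUEUE_REFRESH and AuditLogConstants.TARGET_JOBTRACKER rather than repeating the string literals.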
I don't think we need to give 'Queue:' prefix,
Failure logging should be at WARN level - as done in the other audit log.
If audit logs are important then what is the point of making some of them WARN, ERROR or FATAL? Shouldn't all the audit logs be INFO logs?
Can we move the refreshAcls logging into QueueManager.refreshQueueAcls, so that failures can be logged too?
Can we add a few simple unit tests that test the formatting of the log message alone?
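A formatting-only test could look roughly like the sketch below, assuming the message is built by a pure function that can be called without a logger. The class AuditMessageFormatCheck, the method message, and the 'operation', 'target' and 'result' keys are hypothetical. The patch's real method names and key set may differ, and a real test would use the project's test framework rather than a main method.

```java
// Hypothetical sketch: checks the message text only, with no logger involved.
final class AuditMessageFormatCheck {
    // Assumed formatter; stands in for whatever the patch uses to build the line.
    static String message(String agent, String operation, String target, boolean allowed) {
        return "agent=" + agent
            + "\toperation=" + operation
            + "\ttarget=" + target
            + "\tresult=" + (allowed ? "SUCCESS" : "FAILURE");
    }

    // Minimal stand-in for an assertEquals from a test framework.
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but got <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        check("agent=alice\toperation=QUEUE_REFRESH\ttarget=Jobtracker\tresult=SUCCESS",
              message("alice", "QUEUE_REFRESH", "Jobtracker", true));
        check("agent=bob\toperation=REFRESH_NODES\ttarget=Jobtracker\tresult=FAILURE",
              message("bob", "REFRESH_NODES", "Jobtracker", false));
    }
}
```

Keeping the formatter free of logger calls is what makes this kind of test cheap: the test compares strings and never has to capture log output.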