I don't think this is a good idea. This is going to result in MASSIVE log increases with large MR jobs.
It will create a lot of logging for a large job, but we're already logging a not-so-useful message per shuffle connection after MAPREDUCE-5787. Having an informative message per shuffle connection is very useful for tracking down abusive jobs whose shuffle phase causes NM file descriptors, network, or disk to go haywire.
Allen, would it mitigate your concerns if this were logged with a separately configurable logger, e.g. ShuffleHandlerAuditLogger? That way users could turn it on when they want to audit shuffle transfers and off when they don't.
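To illustrate the idea, here is a rough sketch of what the log4j.properties side might look like. The logger name below is purely hypothetical (the patch would have to define the actual name), and this assumes the NM is using the usual log4j 1.x properties configuration:

```
# Hypothetical audit logger name; the real name depends on what the patch defines.
# Enable per-connection shuffle audit messages:
log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=INFO

# Or silence them entirely on clusters that don't want the extra volume:
# log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=OFF
```

With something along these lines, the regular ShuffleHandler logging would be unaffected, and the audit messages could be switched on only while chasing a misbehaving job.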
Other comments on the patch: please don't add newlines to the output; it just makes the logs visibly longer. I'd prefer a brief one-line message per connection, and IMHO it's redundant to label a job ID with "job:" since the job ID has "job_" in it already.