That being said, I totally agree the TFile format for aggregated logs is not very fun to wield as a user. I don't know the thought process that went into choosing it, but I suspect it was a straightforward way to aggregate all of an app's logfiles on a node into a single file in HDFS.
The original reason I picked TFile is programmatic access for users. With logs there are conflicting use cases - on one hand, users would like them to be human-readable, and on the other hand, people want to write tools against them. So I picked TFile for machine readability, together with a log dumper to facilitate human readability.
Maybe one way to get the benefit of both easy-to-access logs and less namespace pressure is to go ahead and aggregate them as separate files, but have a periodic process archive the logs into a HAR (Hadoop archive) to reduce the namespace footprint. That wouldn't address the significant additional write load this approach would place on the NameNode, however.
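The periodic HAR rollup could look something like the sketch below. The paths, archive name, and daily cadence are all illustrative assumptions, not an agreed convention - the only fixed parts are the `hadoop archive` syntax and the `har://` read path.

```shell
# Hypothetical nightly rollup: pack the previous day's aggregated log
# files into a single HAR to cut the NameNode object count.
# /tmp/logs/$USER/logs is the assumed remote-app-log-dir layout.
DAY=$(date -d yesterday +%Y-%m-%d)
hadoop archive -archiveName logs-$DAY.har \
  -p /tmp/logs/$USER/logs $DAY \
  /tmp/logs/$USER/archived

# The per-app files remain readable through the har:// scheme:
hadoop fs -ls har:///tmp/logs/$USER/archived/logs-$DAY.har
```

Reads keep working because HAR exposes the original file layout; only the NameNode sees fewer objects.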
My hope is reduced complexity at the log file level while punting the small-file problem to the FS layer - the reasoning here being that not all filesystems that can be used with Hadoop have a small-file problem!
Yes, because of the latter issue (NameNode load), we should think before we make this leap. HDFS is the dominant FS that people use for YARN+MR jobs, and YARN needs to work well there.
Would it be helpful for YARN to supply a public API that reads the files for you?
We already have this. See AggregatedLogFormat and LogCLIHelpers.
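For users who don't want to program against those classes, the same read path is already exposed through the `yarn logs` CLI (which is backed by LogCLIHelpers). The application and container IDs below are placeholders, and the `-nodeAddress` value is an assumed host:port:

```shell
# Dump all aggregated logs for one application (app ID is a placeholder):
yarn logs -applicationId application_1400000000000_0001

# Narrow to a single container's logs:
yarn logs -applicationId application_1400000000000_0001 \
  -containerId container_1400000000000_0001_01_000001 \
  -nodeAddress nm-host:45454
```

Either way, callers never touch the TFile layout directly.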
Once we have more power in HDFS, it is very likely that we'll change this to be a single file + directory structure.
We can definitely move things around so that the per-node, per-app file layout applies only to HDFS, while other filesystem implementations use a single file. I am +1 if that is the goal - we just need to find and put the appropriate abstractions in place.