I think Hong is referring to configuration parameters that are likely to modify the behaviour of the job and tasks (e.g. mapred.child.*, mapreduce.job.*, etc.).
No, this is not what this jira intends to solve. But this jira could potentially help. Currently Rumen extracts from jobconf.xml some key-values specific to map-reduce layer, and converts them to regular primitive types. I think the extraction of mapred.child.* and mapreduce.job.* etc should continue along this path.
However, we are starting to think of using Rumen output to analyze the performance of frameworks on top of map-reduce. One example is Pig. Pig will add more information to jobconf.xml to describe the features being used, as well as compile-time statistics. We need a mechanism in Rumen to retain such information in an extensible way, which is the primary purpose of this jira.
Also *-default.xml might not be available for reference comparison.
Correct. That is the main reason we have to make each parsed LoggedJob instance self-contained.
Hmm. But I guess we need to bring in more and more configuration properties soon.
Yes, we will, but the set will not be unbounded. I think we can support extraction of properties based on exact match or prefixes.
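A minimal sketch of what exact-match and prefix-based extraction could look like, assuming the jobconf has already been parsed into a plain key-value map. The class `ConfPropertyFilter`, its `select` method, and the `pig.` prefix and `pig.script.features` key are hypothetical names used for illustration only; none of them are part of Rumen.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of Rumen: keeps only the jobconf entries
// whose key is listed exactly or starts with one of the given prefixes.
final class ConfPropertyFilter {
    private ConfPropertyFilter() {}

    static Map<String, String> select(Map<String, String> conf,
                                      Set<String> exactKeys,
                                      List<String> prefixes) {
        // TreeMap gives a stable, sorted key order in the result.
        Map<String, String> kept = new TreeMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String key = e.getKey();
            if (exactKeys.contains(key) || hasAnyPrefix(key, prefixes)) {
                // Values are retained as-is (strings), with no type conversion.
                kept.put(key, e.getValue());
            }
        }
        return kept;
    }

    private static boolean hasAnyPrefix(String key, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("pig.script.features", "3");          // hypothetical Pig key
        conf.put("mapred.child.java.opts", "-Xmx200m");
        conf.put("io.sort.mb", "100");

        Map<String, String> kept = select(
            conf,
            Collections.singleton("io.sort.mb"),       // exact matches
            Arrays.asList("pig."));                    // prefixes
        System.out.println(kept);
    }
}
```

Because the set of exact keys and prefixes is fixed up front, the number of retained properties stays bounded by what the caller asks for, while a framework such as Pig can add new keys under its own prefix without changes to the parser.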
MAPREDUCE-2153 to get other needed configuration properties into the trace file.
This seems to be in addition to MAPREDUCE-1658. I suggest you roll the two jiras into one (closing MR-1658 and rolling the work into MR-2153).
Also created MAPREDUCE-2152 to drop TraceBuilder's own handling of deprecated configuration properties in favour of the Configuration object.
The purpose of this jira is to extend the set of key-values extracted by the jobconf parser and retain them as-is in the LoggedJob object. So I believe your point is relatively orthogonal to this jira. FWIW, I am a bit concerned about introducing this dependency between Rumen and MapReduce, because I think the handling of deprecated conf parameters is not really a core part of the MapReduce API and could be dropped in the future (which would lead us to move the code into Rumen, similar to the case of Pre21JobHistoryConstants).