Instead of storing the UGI with the submitted job, please store the user as a string. That will be forward compatible when we move to server-side groups. I think it makes sense to do as part of this patch, if it isn't already being done.
The jobconf already has the username. Are you saying that the JT should maintain the mapping from the jobID to the user who was given this jobID (step 1 in the job-submission protocol), so that in the following RPCs the JT can efficiently look up the username from the jobID, rather than having to parse the conf to get it?
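For what it's worth, that lookup could be as simple as a map populated at step 1 of the submission protocol. A minimal sketch, assuming a class and method names that are purely illustrative (not from the patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the JT records the username when it hands out a
// jobID (step 1), so later RPCs can resolve the owner without parsing
// the jobconf. Names here are made up for illustration.
public class JobTrackerUserMap {
    private final Map<String, String> jobToUser = new ConcurrentHashMap<>();

    // Called when the JT assigns a new jobID to a submitting user.
    public void registerJob(String jobId, String userName) {
        jobToUser.put(jobId, userName);
    }

    // Called by subsequent RPCs to resolve the owner of a job.
    public String getUser(String jobId) {
        return jobToUser.get(jobId);
    }

    public static void main(String[] args) {
        JobTrackerUserMap m = new JobTrackerUserMap();
        m.registerJob("job_200901011200_0001", "alice");
        System.out.println(m.getUser("job_200901011200_0001"));
    }
}
```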
The meta information should only include the offset, since the length is redundant with the following split's start.
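Concretely, if the meta file stores only start offsets, each split's length falls out by subtraction against the next offset (or the end of the data for the last split). A sketch of the arithmetic, not the patch's actual file format:

```java
// Sketch: derive split lengths from a sorted array of start offsets plus
// the total length of the split data. Values are illustrative.
public class SplitOffsets {
    static long[] lengthsFromOffsets(long[] offsets, long totalLength) {
        long[] lengths = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            // Length = next split's start (or end of data) minus our start.
            long next = (i + 1 < offsets.length) ? offsets[i + 1] : totalLength;
            lengths[i] = next - offsets[i];
        }
        return lengths;
    }

    public static void main(String[] args) {
        long[] lengths = lengthsFromOffsets(new long[] {0, 128, 512}, 1024);
        System.out.println(lengths[0] + " " + lengths[1] + " " + lengths[2]);
    }
}
```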
We use the binary format instead of XML to store the jobconf. However, when loading the binary format, we need to handle the final parameters.
The conf is serialized using Configuration's write(DataOutput) that actually serializes everything out as strings. The JobTracker then writes the read configuration in the mapred.system.dir using Configuration.writeXml. The JobInProgress constructor loads the conf in the normal way (in the way it happens today). So final parameters defined in the JobTracker will be taken care of in the usual way.
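The final-parameter semantics being relied on here can be illustrated with a toy config class. This is a simplification of what Configuration does, not its actual code: once a resource marks a key final, a later resource cannot override it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy illustration of final-parameter semantics: once a key is marked
// final by an earlier resource, later resources cannot override it.
// This is NOT Hadoop's real Configuration code.
public class ToyConf {
    private final Map<String, String> props = new HashMap<>();
    private final Set<String> finals = new HashSet<>();

    void load(String key, String value, boolean isFinal) {
        if (finals.contains(key)) {
            return; // attempts to override a final parameter are ignored
        }
        props.put(key, value);
        if (isFinal) {
            finals.add(key);
        }
    }

    String get(String key) { return props.get(key); }

    public static void main(String[] args) {
        ToyConf conf = new ToyConf();
        conf.load("mapred.system.dir", "/system", true); // server-side, final
        conf.load("mapred.system.dir", "/evil", false);  // user conf loses
        System.out.println(conf.get("mapred.system.dir"));
    }
}
```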
I'm not very happy with half of the job information being saved in the system directory and half of it in the staging directory. I assume that the staging directory is required to be on the same file system as the system directory? Having the job's definition split into two directories with two different owners seems bad. That is especially true since the data in the system directory will point to particular byte offsets in the staging directory. I think we will be in for some really nasty bugs involving
The way I am seeing it is that the JobTracker is given only the information required to launch the job. Things like job.jar, the split bytes, the distributed cache files, and anything else the users want to use in the job are required by the tasks, and the JT doesn't care about them. Every piece of information is generated by the client; if the client generates wrong byte offsets, only that client's job is affected.
Your sentence about the "nasty bugs" is incomplete.
I assume the cleanup of the staging directory is done by the JobTracker.
Done as part of the job cleanup task.
I guess I would be happier, if as part of JobSubmission, we moved the files from the user's staging area into the system dir. The JobTracker would read (possibly with a cache) the bytes for the task and send them to the user as part of the task definition.
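The cache mentioned here could be something as simple as a bounded LRU map on the JT side. A hypothetical sketch (the class name and capacity are made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a small LRU cache the JT could keep for split
// bytes, keyed by jobID. The capacity is arbitrary for illustration.
public class SplitBytesCache extends LinkedHashMap<String, byte[]> {
    private final int capacity;

    public SplitBytesCache(int capacity) {
        super(16, 0.75f, true); // access-order gives LRU eviction
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        SplitBytesCache cache = new SplitBytesCache(2);
        cache.put("job_1", new byte[] {1});
        cache.put("job_2", new byte[] {2});
        cache.get("job_1");                 // touch job_1 so job_2 is eldest
        cache.put("job_3", new byte[] {3}); // evicts job_2
        System.out.println(cache.containsKey("job_1") + " "
            + cache.containsKey("job_2") + " " + cache.containsKey("job_3"));
    }
}
```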
The split bytes file has a high replication factor of 10 (and it could be something like what Doug suggested). So do we really want the JT to copy the bytes to the system dir? I am trying to weigh the options of letting the tasks read the split bytes from the split file directly versus the JT passing them in the task definition. The former reduces load on the JT (it doesn't have to load the split bytes into memory at all).
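In the first option, a task would open the split file itself and read only its own slice, given the (offset, length) it was handed in the task definition. A plain-Java sketch, using a local file in place of the DFS split file (names and contents are illustrative):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of a task reading its own split bytes directly from the split
// file, given only an (offset, length) pair in its task definition.
// A local temp file stands in for the DFS file here.
public class DirectSplitRead {
    static byte[] readSplit(Path splitFile, long offset, int length)
            throws IOException {
        try (RandomAccessFile raf =
                 new RandomAccessFile(splitFile.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);       // jump to this task's slice
            raf.readFully(buf);     // read exactly `length` bytes
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("job", ".split");
        Files.write(f, "split0split1split2".getBytes("UTF-8"));
        // Suppose task 1 was told its split lives at offset 6, length 6.
        System.out.println(new String(readSplit(f, 6, 6), "UTF-8"));
    }
}
```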