It will be good to have the org.apache.hadoop.mapreduce.Cluster create the rpc client object only when needed (when a call to the jobtracker is actually required). org.apache.hadoop.mapreduce.Job constructs the Cluster object internally and in many cases the application that created the Job object really wants to look at the configuration only. It'd help to not have these connections to the jobtracker especially when Job is used in the tasks (for e.g., Pig calls mapreduce.FileInputFormat.setInputPath in the tasks and that requires a Job object to be passed).
In Hadoop 20, the Job object internally creates the JobClient object, and the same argument applies there too.
- is depended upon by
MAPREDUCE-118 Job.getJobID() will always return null