It would be good to have org.apache.hadoop.mapreduce.Cluster create the RPC client object only when needed, i.e., when a call to the JobTracker is actually required. org.apache.hadoop.mapreduce.Job constructs the Cluster object internally, and in many cases the application that created the Job object only wants to look at the configuration. Avoiding these connections to the JobTracker would help especially when Job is used in the tasks (e.g., Pig calls mapreduce.FileInputFormat.setInputPath in the tasks, and that requires a Job object to be passed).
In Hadoop 0.20, the Job object internally creates the JobClient object, and the same argument applies there too.
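To illustrate the idea, here is a minimal sketch of lazy client construction. It is not the actual patch: `JobTrackerClient`, `LazyCluster`, `getJobStatus`, and `isConnected` are hypothetical names invented for this example, and the `Supplier` stands in for whatever code really opens the RPC connection.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the RPC proxy to the JobTracker. */
interface JobTrackerClient {
    String getJobStatus(String jobId);
}

/** Hypothetical Cluster-like class that defers connecting until the first RPC. */
final class LazyCluster {
    // Assumption: opening the connection is wrapped in a Supplier.
    private final Supplier<JobTrackerClient> connector;
    private JobTrackerClient client; // stays null until an RPC is needed

    LazyCluster(Supplier<JobTrackerClient> connector) {
        // Only remember how to connect; do not connect here.
        this.connector = connector;
    }

    /** Creates the client on first use and reuses it afterwards. */
    private synchronized JobTrackerClient client() {
        if (client == null) {
            client = connector.get();
        }
        return client;
    }

    /** An operation that genuinely needs the JobTracker. */
    String getJobStatus(String jobId) {
        return client().getJobStatus(jobId);
    }

    /** True once a connection has actually been created. */
    synchronized boolean isConnected() {
        return client != null;
    }
}

class LazyClusterDemo {
    public static void main(String[] args) {
        // Fake connector that returns a canned status instead of doing RPC.
        LazyCluster cluster = new LazyCluster(() -> jobId -> "RUNNING");
        System.out.println(cluster.isConnected());            // false
        System.out.println(cluster.getJobStatus("job_0001")); // RUNNING
        System.out.println(cluster.isConnected());            // true
    }
}
```

With this shape, code that only reads the configuration (such as a task calling an input-format setter) never triggers a connection; only calls that go through `client()` do.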
|Field||Original Value||New Value|
|Summary||Job class should create the rpc client only when needed||Cluster class should create the rpc client only when needed|
|Release Note|| ||Lazily construct a connection to the JobTracker from the job-submission client.|
|Assignee|| ||Dick King [ dking ]|
|Status||Patch Available [ 10002 ]||Open [ 1 ]|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Resolution|| ||Fixed [ 1 ]|
|Status||Resolved [ 5 ]||Closed [ 6 ]|
|Transition||Time In Source Status||Execution Times||Last Executer||Last Execution Date|
| ||5d 2h 19m||1||Arun C Murthy||24/May/10 20:33|
| ||92d 17h 24m||2||Dick King||27/May/10 17:40|
| ||9d 7h 47m||1||Chris Douglas||06/Jun/10 01:28|
| ||554d 4h 51m||1||Konstantin Shvachko||12/Dec/11 06:19|