While instrumenting the INPUT_SUPERSTEP and watching several runs, I noticed that the input split list, which is regenerated every time a worker calls reserveInputSplit, is for all intents and purposes immutable for the life of the job. We can therefore save a fair amount of memory, and avoid re-querying ZooKeeper, by building the list once instead of re-creating it on each pass to claim another split. Only the reserved and finished child lists are mutated during the input phase of the job.
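A minimal sketch of the caching idea, not the actual patch: the class name `CachedInputSplitPaths` and the `Supplier` standing in for the ZooKeeper child-list query are hypothetical, introduced only for illustration. The list is fetched on the first call and reused on every later call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/**
 * Hypothetical helper: caches the per-job input split path list so that
 * repeated reservation attempts do not rebuild it.
 */
final class CachedInputSplitPaths {
    /** Stand-in (assumption) for the ZooKeeper query that lists the split znodes. */
    private final Supplier<List<String>> fetcher;
    /** Populated on first use; never refreshed, since the list is fixed per job. */
    private List<String> cached;

    CachedInputSplitPaths(Supplier<List<String>> fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the split paths, querying the backing store only on the first call. */
    synchronized List<String> get() {
        if (cached == null) {
            // Copy defensively and wrap so callers cannot mutate the shared list.
            cached = Collections.unmodifiableList(new ArrayList<>(fetcher.get()));
        }
        return cached;
    }
}
```

Under this sketch, only the reserved and finished markers would still need to be checked against ZooKeeper on each pass, since those are the parts that change during the input phase.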
|Transition||Time In Source Status||Execution Times||Last Executer||Last Execution Date|
|Open → Patch Available||6m 47s||1||Eli Reisman||18/Aug/12 21:19|
|Patch Available → Resolved||48d 20h 35m||1||Avery Ching||06/Oct/12 17:54|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Resolution||Fixed [ 1 ]|
|Status||Open [ 1 ]||Patch Available [ 10002 ]|