The JobTracker's memory is mainly used by TaskInProgress objects. We submitted a job with 100,087 tasks; the JT's memory usage was as follows:
Shallow size 29,625,752
Retained size 325,065,944 (96%)
Our optimizations are as follows:
(1) Reduce duplicated strings
The JobTracker stores too many duplicated strings, for example: split class names, split locations, counter group names, counter names, display names, the jtIdentifier of JobID, and the jobdir of MapOutputFile.
Using a StringCache to deduplicate these strings reduced memory usage by nearly 15%.
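As an illustration only, a string cache of this kind can be sketched as below. The class shape and the method name `intern` are assumptions made for this example, not the actual StringCache from the patch; the idea is simply that equal strings are mapped to one shared instance so the duplicates can be garbage collected.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a string cache; the real StringCache may differ.
final class StringCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    /** Returns the canonical instance equal to s, registering s if it is the first one seen. */
    String intern(String s) {
        if (s == null) {
            return null;
        }
        String previous = cache.putIfAbsent(s, s);
        return previous != null ? previous : s;
    }

    /** Number of distinct strings currently held. */
    int size() {
        return cache.size();
    }
}

class StringCacheDemo {
    public static void main(String[] args) {
        StringCache cache = new StringCache();
        // Two distinct String objects with equal content, as produced by deserializing splits.
        String a = cache.intern(new String("org.apache.hadoop.mapred.FileSplit"));
        String b = cache.intern(new String("org.apache.hadoop.mapred.FileSplit"));
        System.out.println(a == b); // both references now point to the same object
    }
}
```

A cache like this trades a small, shared map for the removal of per-task copies, so it pays off only for values that repeat across many tasks, such as the ones listed above.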
(2) Counters should be lazily initialized
TIPs with no task attempt assigned should not create Counters.
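A minimal sketch of the lazy initialization idea follows. `SimpleCounters` and `LazyTip` are hypothetical stand-ins for Hadoop's Counters and TaskInProgress, used only to keep the example self-contained; the point is that the counters object is allocated on the first update rather than in the constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for Hadoop's Counters.
final class SimpleCounters {
    private final Map<String, Long> values = new HashMap<>();

    void increment(String name, long delta) {
        values.merge(name, delta, Long::sum);
    }

    long get(String name) {
        return values.getOrDefault(name, 0L);
    }
}

// Hypothetical stand-in for TaskInProgress.
final class LazyTip {
    // Stays null until a task attempt first reports a counter.
    private SimpleCounters counters;

    synchronized boolean hasCounters() {
        return counters != null;
    }

    synchronized void updateCounter(String name, long delta) {
        if (counters == null) {
            counters = new SimpleCounters(); // allocated on first use only
        }
        counters.increment(name, delta);
    }

    synchronized long getCounter(String name) {
        // A TIP that never ran reports zero without allocating anything.
        return counters == null ? 0L : counters.get(name);
    }
}
```

With this shape, a job whose tasks are mostly still pending holds no counter structures for those tasks at all.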
(3) Restructure the counters of completed TIPs
When a task completes, its TIP grows because it retains the task's counters. To speed up counter updates and lookups, Counters uses a HashMap and a cache, which cost too much memory. So we separated the counter values from the Counters structure: all tasks share a CounterMap object, which maps <CounterGroupName, CounterName> to an index into a long array, and every TIP stores an array of its counter values.
Using this method, the JT's memory usage was reduced by nearly 50%.
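The layout described in (3) can be sketched roughly as follows. Only the name CounterMap comes from the description above; `CompactTipCounters`, the method names, and the key encoding are assumptions made for illustration and may not match the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Shared by all tasks: maps <CounterGroupName, CounterName> to a slot index.
final class CounterMap {
    private final Map<String, Integer> indexByKey = new HashMap<>();

    private static String key(String group, String counter) {
        // NUL separator (an assumption) keeps ("a", "bc") distinct from ("ab", "c").
        return group + '\u0000' + counter;
    }

    /** Returns the slot for this counter, assigning the next free slot if it is new. */
    synchronized int indexOf(String group, String counter) {
        String k = key(group, counter);
        Integer index = indexByKey.get(k);
        if (index == null) {
            index = indexByKey.size();
            indexByKey.put(k, index);
        }
        return index;
    }

    /** Returns the slot for this counter, or -1 if it was never registered. */
    synchronized int lookup(String group, String counter) {
        Integer index = indexByKey.get(key(group, counter));
        return index == null ? -1 : index;
    }
}

// Per-TIP storage: just a long array indexed through the shared CounterMap.
final class CompactTipCounters {
    private final CounterMap shared;
    private long[] values = new long[0];

    CompactTipCounters(CounterMap shared) {
        this.shared = shared;
    }

    void set(String group, String counter, long value) {
        int i = shared.indexOf(group, counter);
        if (i >= values.length) {
            values = Arrays.copyOf(values, i + 1); // grow to cover the new slot
        }
        values[i] = value;
    }

    long get(String group, String counter) {
        int i = shared.lookup(group, counter);
        return (i >= 0 && i < values.length) ? values[i] : 0L;
    }
}
```

In this sketch each completed TIP pays 8 bytes per counter plus one array header, while the group and counter names are stored once in the shared map instead of once per task.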