Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.24.0
    • Component/s: jobtracker
    • Labels:

      Description

      Too many tasks eat up a considerable amount of the JobTracker's heap space. According to our observations, a 50 GB heap can support up to 5,000,000 tasks, so we should optimize the jobtracker's memory usage to support more jobs and tasks. A YourKit Java profile shows that counters, duplicated strings, and task objects waste the most memory. Our optimizations around these three points reduced the jobtracker's memory usage to 1/3 of its original size.

        Activity

        MengWang added a comment -

        The jobtracker's memory is mainly used for TaskInProgress objects. We submitted a job with 100,087 tasks; the JT's memory usage is as follows:

        MengWang added a comment -

        The jobtracker's memory is mainly used for TaskInProgress objects. We submitted a job with 100,087 tasks; the JT's memory usage was as follows:

        org.apache.hadoop.mapred.TaskInProgress
          objects:       100,087
          shallow size:  29,625,752 bytes
          retained size: 325,065,944 bytes (96%)

        Our optimization work was as follows:
        (1) Reduce duplicated strings
        The jobtracker stores too many duplicated strings, for example: the splitClass name, split locations, counter group names, counter names, display names, the jtIdentifier of JobID, and the jobdir of MapOutputFile. Using a StringCache to share these strings reduced memory by nearly 15%.
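The StringCache mentioned above can be pictured as a small interning map. The sketch below is illustrative only; the class and method names are assumptions of this sketch, not the patch's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the StringCache idea above: equal strings
// (split locations, counter group/name strings, and so on) are stored
// once, and every TaskInProgress shares the canonical copy. Class and
// method names are assumptions, not Hadoop's actual implementation.
class StringCache {
    private final Map<String, String> cache = new HashMap<>();

    // Return the canonical copy of s, remembering it on first sight.
    synchronized String intern(String s) {
        if (s == null) {
            return null;
        }
        String canonical = cache.putIfAbsent(s, s);
        return canonical == null ? s : canonical;
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        StringCache cache = new StringCache();
        // Two distinct String objects with equal contents...
        String a = cache.intern(new String("rack1/host42"));
        String b = cache.intern(new String("rack1/host42"));
        // ...collapse to a single canonical reference.
        System.out.println((a == b) + " size=" + cache.size());  // prints "true size=1"
    }
}
```

Each TaskInProgress would then store the interned reference instead of its own copy, so tens of thousands of tasks pointing at the same rack/host or counter-name strings share one String object apiece.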
        (2) Counters should be lazily initialized
        TIPs with no task attempt assigned should not create Counters.
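The lazy-initialization point can be sketched as a field that stays null until first use; all names below are illustrative stand-ins, not the real org.apache.hadoop.mapred.TaskInProgress or Counters classes:

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of lazy Counters initialization: the TIP carries a null
// reference until a task attempt actually reports something, so queued
// TIPs pay no Counters cost. Names here are illustrative stand-ins.
class LazyCountersTip {
    // Minimal stand-in for the real Counters structure.
    static class Counters {
        final Map<String, Long> values = new HashMap<>();
    }

    private Counters counters;  // stays null while the TIP is queued

    boolean hasCounters() {
        return counters != null;
    }

    // Allocate on first use instead of in the constructor.
    Counters getCounters() {
        if (counters == null) {
            counters = new Counters();
        }
        return counters;
    }

    public static void main(String[] args) {
        LazyCountersTip tip = new LazyCountersTip();
        System.out.println(tip.hasCounters());  // false: no attempt assigned yet
        tip.getCounters().values.put("MAP_INPUT_RECORDS", 1L);
        System.out.println(tip.hasCounters());  // true: allocated on demand
    }
}
```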
        (3) Restructure completed TIPs' counters
        When a task completes, its TIP grows because of counters. To speed up counter updates and lookups, Counters uses a HashMap plus a cache, which costs too much memory. So we separated the counter values from the Counters structure: all tasks share one CounterMap object, which maps <CounterGroupName, CounterName> to an index into a long array, and every TIP stores only the array of its counter values.
        Using this method, the JT's memory was reduced by nearly 50%.
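The shared CounterMap described above might look like the following sketch; the class name comes from the comment, but the key encoding and method names are assumptions of this sketch rather than the patch's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the shared CounterMap idea: one job-wide map assigns each
// (group, counter) pair a slot index, and each TIP keeps only a long[]
// of values at those slots instead of a full HashMap-based Counters
// object. Names and the key encoding are assumptions of this sketch.
class SharedCounterMap {
    private final Map<String, Integer> slots = new HashMap<>();

    // Get, or assign on first sight, the slot index for a pair.
    synchronized int slot(String group, String name) {
        String key = group + '\0' + name;  // counter names contain no NUL
        Integer i = slots.get(key);
        if (i == null) {
            i = slots.size();
            slots.put(key, i);
        }
        return i;
    }

    synchronized int size() {
        return slots.size();
    }

    public static void main(String[] args) {
        SharedCounterMap map = new SharedCounterMap();
        int read = map.slot("FileSystemCounters", "HDFS_BYTES_READ");
        int written = map.slot("FileSystemCounters", "HDFS_BYTES_WRITTEN");

        // Per-TIP storage is just an array indexed by slot; in practice
        // it would be sized once all of the job's counters are known.
        long[] tipValues = new long[map.size()];
        tipValues[read] += 1024;
        tipValues[written] += 512;
        System.out.println(tipValues[read] + " " + tipValues[written]);  // 1024 512
    }
}
```

The win is that the per-TIP cost drops from a full Counters object (HashMaps, group objects, name strings) to one long[] plus a shared index.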

        Arun C Murthy added a comment -

        Meng, interesting analysis, thanks!

        To be perfectly honest, I'm surprised you guys are seeing this many memory issues with the JT... what version of the Hadoop Map-Reduce are you running? A simple solution we have deployed at Yahoo! for a long while now is to aggressively cut down #completed jobs in memory which has helped a lot. Something to consider for you guys.
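For reference, the aggressive retirement of completed jobs that Arun describes was configurable on older MapReduce versions via a JobTracker property; if memory serves it is mapred.jobtracker.completeuserjobs.maximum, but verify the property name and default against your release before relying on it. A hypothetical mapred-site.xml fragment:

```xml
<!-- Hypothetical mapred-site.xml fragment: keep at most 25 completed
     jobs per user in the JobTracker's memory. Check the exact property
     name and default against your Hadoop release. -->
<property>
  <name>mapred.jobtracker.completeuserjobs.maximum</name>
  <value>25</value>
</property>
```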

        Kang Xiao added a comment -

        Thanks Arun, it's really a good solution to retire completed jobs from memory and the function is in trunk. But how about a running job with tens of thousands of tasks? We see that big running jobs use much memory in the cluster.

        MengWang added a comment -

        Thanks Kang, you got it.

        Allen Wittenauer added a comment -

        > But how about a running job with tens of thousands of tasks? We see that big running
        > jobs use much memory in the cluster.

        This is almost always a sign that either the data being read is not laid out efficiently or the block size is too small, that one needs to use CombineFileInputFormat, or that there are just too many reducers in play. There is almost never a reason to have jobs with tasks in the x0,000 range unless the dataset is Just That Big.


          People

          • Assignee:
            Unassigned
          • Reporter:
            MengWang
          • Votes:
            1
          • Watchers:
            16
