  1. Hadoop Map/Reduce
  2. MAPREDUCE-4752

Reduce MR AM memory usage through String Interning


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.3-alpha, 0.23.5
    • Component/s: mrv2
    • Labels: None

    Description

      There are a lot of strings in the AM that are duplicates of one another. They come from all of the PB (protocol buffer) events that come across the wire and from tasks heart-beating in through the umbilical. There are even several duplicates from Configuration. By "interning" all of these strings on the heap I have been able to reduce the resting memory usage of the AM to about 5KB per task attempt, with about half of this coming from counters. This results in a 5MB heap for a typical 1,000 task job, or a 500MB heap for a 100,000 task attempt job. I think I could cut the size of the counters in half by completely rewriting how counters work in the AM and History Server, but I don't think it is worth it at this point.
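
      To illustrate the idea, here is a minimal sketch of string interning in plain Java. It is not taken from the attached patch: the class name InternSketch, the intern and poolSize methods, and the "RUNNING" status string are all hypothetical, chosen only for the example. Every string that arrives (for instance, one deserialized from a heartbeat) is swapped for a single canonical instance, so duplicates become garbage instead of staying on the heap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper for illustration; not the class used in the actual patch. */
final class InternSketch {
    // Holds one canonical copy of each distinct string seen so far.
    private static final ConcurrentMap<String, String> POOL = new ConcurrentHashMap<>();

    /** Returns the canonical instance equal to s, registering s if it is new. */
    static String intern(String s) {
        if (s == null) {
            return null;
        }
        String existing = POOL.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    /** Number of distinct strings currently retained by the pool. */
    static int poolSize() {
        return POOL.size();
    }

    public static void main(String[] args) {
        // Simulate the same status string arriving in many heartbeats,
        // each time deserialized into a fresh String object.
        String first = intern(new String("RUNNING"));
        for (int i = 0; i < 1000; i++) {
            String fresh = new String("RUNNING");   // a new duplicate on the heap
            String canonical = intern(fresh);       // fresh can now be collected
            if (canonical != first) {
                throw new AssertionError("expected the shared canonical instance");
            }
        }
        System.out.println("distinct strings retained: " + poolSize());
    }
}
```

      One caveat with this sketch: the pool holds strong references, so entries are never released. In a long-running process that sees many one-off strings, a pool built on weak references would probably be the safer choice.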

      I am still investigating what the memory usage of the AM is like when running very large jobs, and I will probably have a follow-up JIRA for reducing that memory usage as well.

      Attachments

        1. MR-4752-trunk.txt
          32 kB
          Robert Joseph Evans
        2. MR-4752-branch-0.23.txt
          33 kB
          Robert Joseph Evans
        3. MR-4752-trunk.txt
          30 kB
          Robert Joseph Evans
        4. MR-4752-branch-0.23.txt
          31 kB
          Robert Joseph Evans


          People

            Assignee: Robert Joseph Evans (revans2)
            Reporter: Robert Joseph Evans (revans2)