  1. Hadoop Common
  2. HADOOP-2765

setting memory limits for tasks

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.3
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Release Note: This feature enables specifying ulimits for streaming and pipes tasks. Pipes and streaming tasks now have the same virtual memory available as the Java process that invokes them. The ulimit value is the same as the -Xmx value supplied to Java processes through mapred.child.java.opts.
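
      The release note ties the ulimit to the -Xmx value in mapred.child.java.opts. As a rough illustration of that mapping (not the code from the attached patch), the sketch below extracts an -Xmx setting from an options string and converts it to kilobytes, the unit that `ulimit -v` expects. The function name `xmx_to_ulimit_kb` is hypothetical, and the sketch assumes the options string holds a single -Xmx flag with an optional k, m, or g suffix.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the -Xmx value found in a JVM options string,
 * expressed in kilobytes, or -1 if no usable -Xmx flag is present.
 * Assumes a single -Xmx flag; overflow of very large values is not handled. */
long xmx_to_ulimit_kb(const char *java_opts)
{
    const char *flag = strstr(java_opts, "-Xmx");
    if (flag == NULL)
        return -1;
    flag += 4; /* skip past "-Xmx" */
    if (!isdigit((unsigned char)*flag))
        return -1;

    char *end = NULL;
    long value = strtol(flag, &end, 10);
    char unit = (char)tolower((unsigned char)*end);

    if (unit == 'k') return value;               /* already kilobytes */
    if (unit == 'm') return value * 1024;        /* megabytes */
    if (unit == 'g') return value * 1024 * 1024; /* gigabytes */
    if (unit == '\0' || unit == ' ')
        return value / 1024;                     /* plain bytes */
    return -1;                                   /* unrecognized suffix */
}
```

      With an options string of `-Xmx200m`, this yields 204800, which a launcher could pass to `ulimit -v` before starting the streaming or pipes child.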

    Description

      Here's the motivation:

      We want to put a memory limit on user scripts to prevent runaway scripts from bringing down nodes. The limit is set well below the maximum memory that can be used, since runaway usage is most likely a scripting bug. At the same time, we want careful users to be able to use more memory by overriding this limit.

      There's no good way to do this. We can set a ulimit in the Hadoop shell scripts, but that is very restrictive. There doesn't seem to be a way to call setrlimit from Java, and setting a ulimit means that supplying a higher Xmx limit from the jobconf is useless (the Java process will be limited by the ulimit in effect when the tasktracker was launched).

      What we have ended up doing (and I think this might help others as well) is to add a stream.wrapper option. The value of this option is a program through which streaming mapper and reducer scripts are exec'ed. In our case, the wrapper is a small C program that calls setrlimit and then execs the streaming job. The default wrapper puts a reasonable limit on memory usage, but users can easily override it (e.g. by invoking the wrapper with a different memory limit argument). We can also use the wrapper for other system-wide resource limits (or any environment settings) in the future.
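
      The wrapper described above could look roughly like the following C sketch. It is not the program from this report: the function names `parse_limit_mb` and `exec_with_limit`, the megabyte unit, and the choice of `RLIMIT_AS` (the address-space limit) are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: parse a memory limit given in megabytes.
 * Returns 0 on success and stores the value in *mb_out, -1 on bad input. */
int parse_limit_mb(const char *text, unsigned long *mb_out)
{
    /* Require a leading digit: strtoul would otherwise accept "-5" or " 5". */
    if (text[0] < '0' || text[0] > '9')
        return -1;

    char *end = NULL;
    errno = 0;
    unsigned long mb = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0' || mb == 0)
        return -1;

    *mb_out = mb;
    return 0;
}

/* Hypothetical helper: cap the address space of this process, then replace
 * it with the given command. Resource limits survive exec, so the command
 * runs under the cap. Returns -1 only if setrlimit or exec fails. */
int exec_with_limit(unsigned long limit_mb, char *const argv[])
{
    struct rlimit lim;
    lim.rlim_cur = (rlim_t)limit_mb * 1024 * 1024; /* soft limit in bytes */
    lim.rlim_max = lim.rlim_cur;                   /* hard limit matches */

    if (setrlimit(RLIMIT_AS, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }

    execvp(argv[0], argv); /* does not return on success */
    perror("execvp");
    return -1;
}
```

      A `main` built on these helpers would parse its first argument with `parse_limit_mb`, fall back to a site default when the argument is missing, and hand the remaining arguments to `exec_with_limit`. Because the limit is applied in the wrapper rather than in the tasktracker's environment, the JVM's own -Xmx setting is unaffected.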

      This way, JVMs can stick to mapred.child.java.opts as the way to control memory usage. This setup has saved us on many occasions while still allowing sophisticated users to use high memory limits.

      I can submit a patch if this sounds interesting.

      Attachments

        1. patch-2765.txt
          9 kB
          Amareshwari Sriramadasu
        2. patch-2765.txt
          10 kB
          Amareshwari Sriramadasu
        3. patch-2765.txt
          10 kB
          Amareshwari Sriramadasu
        4. patch-2765.txt
          10 kB
          Amareshwari Sriramadasu
        5. patch-2765.txt
          9 kB
          Amareshwari Sriramadasu
        6. patch-2765.txt
          16 kB
          Amareshwari Sriramadasu
        7. patch-2765.txt
          16 kB
          Amareshwari Sriramadasu
        8. patch-2765.txt
          16 kB
          Amareshwari Sriramadasu
        9. 2765.1.patch
          18 kB
          Amareshwari Sriramadasu

          People

            Assignee: Amareshwari Sriramadasu (amareshwari)
            Reporter: Joydeep Sen Sarma (jsensarma)

            Dates

              Created:
              Updated:
              Resolved: