
Key: HADOOP-2765
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Amareshwari Sriramadasu
Reporter: Joydeep Sen Sarma
Votes: 0
Watchers: 0

Hadoop Common

setting memory limits for tasks

Created: 01/Feb/08 09:39 AM   Updated: 08/Jul/09 05:05 PM
Component/s: None
Affects Version/s: 0.15.3
Fix Version/s: 0.17.0

Time Tracking:
Not Specified

File Attachments (text files, all licensed for inclusion in ASF works):
  File            Uploaded             By                       Size
  2765.1.patch    2008-03-10 06:36 AM  Amareshwari Sriramadasu  18 kB
  patch-2765.txt  2008-03-03 06:38 AM  Amareshwari Sriramadasu  16 kB
  patch-2765.txt  2008-03-03 04:35 AM  Amareshwari Sriramadasu  16 kB
  patch-2765.txt  2008-02-29 12:33 PM  Amareshwari Sriramadasu  16 kB
  patch-2765.txt  2008-02-25 09:17 AM  Amareshwari Sriramadasu  9 kB
  patch-2765.txt  2008-02-21 05:03 AM  Amareshwari Sriramadasu  10 kB
  patch-2765.txt  2008-02-20 12:32 PM  Amareshwari Sriramadasu  10 kB
  patch-2765.txt  2008-02-20 09:11 AM  Amareshwari Sriramadasu  10 kB
  patch-2765.txt  2008-02-19 11:03 AM  Amareshwari Sriramadasu  9 kB

Hadoop Flags: Incompatible change
Release Note:
This feature enables specifying ulimits for streaming/pipes tasks. Pipes and streaming tasks now have the same virtual memory available as the Java process that invokes them. The ulimit value is the same as the -Xmx value supplied for Java processes through mapred.child.java.opts.
Resolution Date: 11/Mar/08 06:16 PM
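
The release note ties the ulimit value to the -Xmx value in mapred.child.java.opts. A minimal sketch of how such a value could be derived is shown here; the class and method names are hypothetical and are not taken from the committed patch. It assumes the last -Xmx occurrence should win and that a value without a unit suffix is in bytes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: extracts -Xmx from a JVM option string.
final class XmxToUlimit {
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the -Xmx value in kilobytes, or -1 if no -Xmx option is present. */
    static long xmxInKilobytes(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        long kb = -1;
        while (m.find()) { // assumption: the last -Xmx on the line wins
            long value = Long.parseLong(m.group(1));
            String unit = m.group(2).toLowerCase();
            if (unit.equals("g")) {
                kb = value * 1024 * 1024;
            } else if (unit.equals("m")) {
                kb = value * 1024;
            } else if (unit.equals("k")) {
                kb = value;
            } else {
                kb = value / 1024; // no suffix: the value is in bytes
            }
        }
        return kb;
    }

    public static void main(String[] args) {
        System.out.println(xmxInKilobytes("-Xmx200m"));
    }
}
```

The result is in kilobytes because that is the unit bash's ulimit -v expects.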


Description
here's the motivation:

we want to put a memory limit on user scripts to prevent runaway scripts from bringing down nodes. this setting is much lower than the max. memory that can be used (since most likely these tend to be scripting bugs). At the same time - for careful users, we want to be able to let them use more memory by overriding this limit.

there's no good way to do this. we can set ulimit in hadoop shell scripts - but they are very restrictive. there doesn't seem to be a way to do a setrlimit from Java - and setting a ulimit means that supplying a higher Xmx limit from the jobconf is useless (the java process will be limited by the ulimit setting when the tasktracker was launched).

what we have ended up doing (and i think this might help others as well) is to have a stream.wrapper option. the value of this option is a program through which streaming mapper and reducer scripts are execed. in our case, this wrapper is a small C program that does a setrlimit and then an exec of the streaming job. the default wrapper puts a reasonable limit on the memory usage - but users can easily override this wrapper (eg by invoking it with a different memory limit argument). we can use the wrapper for other system wide resource limits (or any environment settings) as well in future.

This way - JVMs can stick to mapred.child.java.opts as the way to control memory usage. This setup has saved our ass on many occasions while allowing sophisticated users to use high memory limits.
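
To illustrate the stream.wrapper idea on the Java side, here is a rough sketch of how a launcher might prepend the wrapper (and its arguments) to the user's script before exec. The class name, the whitespace splitting of the option value, and the example wrapper path are all assumptions made for illustration; the actual C wrapper and its argument format are not shown in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the argv for a streaming script run through a wrapper.
final class WrapperCommand {
    /**
     * Prepends the wrapper program and its arguments to the user command.
     * A null or blank wrapper leaves the user command unchanged.
     */
    static List<String> wrap(String wrapper, List<String> userCommand) {
        List<String> argv = new ArrayList<String>();
        if (wrapper != null && !wrapper.trim().isEmpty()) {
            // assumption: the option value is a whitespace-separated command line
            argv.addAll(Arrays.asList(wrapper.trim().split("\\s+")));
        }
        argv.addAll(userCommand);
        return argv;
    }

    public static void main(String[] args) {
        // "/usr/local/bin/memlimit 512" is a made-up wrapper invocation
        System.out.println(wrap("/usr/local/bin/memlimit 512",
                                Arrays.asList("perl", "map.pl")));
    }
}
```

Under this scheme a careful user raises the limit by supplying a different wrapper value, while the default wrapper keeps the conservative limit.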

Can submit patch if this sounds interesting.



Devaraj Das added a comment - 01/Feb/08 12:54 PM
Please submit the patch. definitely interesting.

Runping Qi added a comment - 03/Feb/08 02:40 AM
+1

Amareshwari Sriramadasu added a comment - 13/Feb/08 12:20 PM
Instead of writing a C wrapper, we can set the memory limit using ulimit from bash, where the child process exec happens.

"setting a ulimit means that supplying a higher Xmx limit from the jobconf is useless (the java process will be limited by the ulimit setting when the tasktracker was launched)."

Now, instead of the Xmx limit, we would have a ulimit value set by "mapred.child.memory". This memory limit would then apply to Java tasks, pipes and streaming alike (since all are execed by bash).

Thoughts?


Amareshwari Sriramadasu added a comment - 19/Feb/08 11:10 AM
Here is a patch to set memory limits for pipes and streaming tasks.
As Sameer suggested offline, I didn't introduce any new config item here.
Java tasks are still managed by the Xmx limit. Pipes and streaming tasks get a memory limit equal to the max memory of the parent Java task, i.e. pipes and streaming tasks are started with
ulimit -m Runtime.getRuntime().maxMemory() / 1024
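
As a small illustration of the expression above, the following sketch converts the parent JVM's maximum heap from bytes to kilobytes and formats the setup command. The class and method names are hypothetical and not taken from the patch.

```java
// Hypothetical helper: derives the ulimit setup command from the parent JVM's max memory.
final class ParentMemoryLimit {
    /** ulimit takes kilobytes, while Runtime.maxMemory() reports bytes. */
    static long toKilobytes(long bytes) {
        return bytes / 1024;
    }

    static String setupCommand(long maxMemoryBytes) {
        return "ulimit -m " + toKilobytes(maxMemoryBytes);
    }

    public static void main(String[] args) {
        // Note: maxMemory() returns Long.MAX_VALUE when the JVM has no inherent limit.
        System.out.println(setupCommand(Runtime.getRuntime().maxMemory()));
    }
}
```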

Hadoop QA added a comment - 19/Feb/08 01:11 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12375915/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac -1. The applied patch generated 624 javac compiler warnings (more than the trunk's current 623 warnings).

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs -1. The patch appears to introduce 1 new Findbugs warnings.

core tests -1. The patch failed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1815/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1815/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1815/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1815/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 20/Feb/08 09:14 AM
This patch should fix javac and findbugs warnings.

Hadoop QA added a comment - 20/Feb/08 10:31 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376006/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc -1. The javadoc tool appears to have generated 1 warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1819/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1819/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1819/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1819/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 20/Feb/08 12:35 PM
Sorry for the javadoc warning; it was introduced by mistake in an attempt to get rid of the javac warning.
Even though the method is annotated with @deprecated, I still see the following javac warning:
[javac] src/java/org/apache/hadoop/mapred/TaskLog.java:204: warning: [dep-ann] deprecated name isnt annotated with @Deprecated

Hadoop QA added a comment - 20/Feb/08 12:47 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376014/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac -1. The applied patch generated 620 javac compiler warnings (more than the trunk's current 619 warnings).

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs -1. The patch appears to cause Findbugs to fail.

core tests -1. The patch failed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1820/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1820/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1820/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 21/Feb/08 05:02 AM
It seems the @deprecated Javadoc tag alone is not enough; the method must also carry the @Deprecated annotation. Sorry for the wrong patch.

Amareshwari Sriramadasu added a comment - 21/Feb/08 05:04 AM
Submitting a patch that fixes the javac, javadoc and findbugs warnings.

Hadoop QA added a comment - 21/Feb/08 06:09 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376090/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1823/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1823/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1823/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1823/console

This message is automatically generated.


Devaraj Das added a comment - 22/Feb/08 08:41 AM
Some comments:
1) I don't see why we should deprecate captureOutAndErr. We could just add a new method that takes an additional arg setupCmd.
2) Shouldn't the addSetupCommand method put the strings in quotes (single quotes, similar to what is done in addCommand)? Also, the semicolon should probably be added after the for-loop terminates, and there should be a space between each of the strings (again very similar to what is done in addCommand). Now the question is whether we really require an additional method addSetupCommand.
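
The quoting rules described in comment 2 can be sketched roughly as follows. This is not the code from TaskLog; the class name and the setupThenCommand helper are hypothetical, and the escaping of embedded single quotes is an extra safeguard added for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the quoting described above; not the TaskLog code.
final class ShellQuoting {
    /** Wraps each string in single quotes and separates the strings with spaces. */
    static String addCommand(List<String> cmd) {
        StringBuilder sb = new StringBuilder();
        for (String s : cmd) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // assumption beyond the comment: escape embedded single quotes as '\''
            sb.append('\'').append(s.replace("'", "'\\''")).append('\'');
        }
        return sb.toString();
    }

    /** Emits the setup command, one semicolon after its loop finishes, then the command. */
    static String setupThenCommand(List<String> setup, List<String> cmd) {
        if (setup.isEmpty()) {
            return addCommand(cmd); // avoid a leading ';', which bash rejects
        }
        return addCommand(setup) + "; " + addCommand(cmd);
    }

    public static void main(String[] args) {
        System.out.println(setupThenCommand(
            Arrays.asList("ulimit", "-v", "1024"),
            Arrays.asList("cat", "my file")));
    }
}
```

Because both lists go through the same quoting routine, a separate addSetupCommand method becomes unnecessary, which is the direction the next patch takes.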

Amareshwari Sriramadasu added a comment - 25/Feb/08 09:18 AM
Incorporated Devaraj's comments.
The patch no longer deprecates captureOutAndErr, and addCommand is used instead of addSetupCommand.

Hadoop QA added a comment - 27/Feb/08 07:12 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376389/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included -1. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1840/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1840/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1840/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1840/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 29/Feb/08 12:43 PM - edited
The patch addresses the following issues:
1. The memory limit is now set with ulimit -v (instead of ulimit -m as in the previous patch), since ulimit -v is the maximum amount of virtual memory available to the shell.
2. Now that all streaming tasks get the same virtual memory as the parent Java task, launching a Java streaming task (e.g. TrApp.class) requires more memory than the 256MB set in build-contrib.xml. I noticed this in the unit tests, so the maxmemory value in build-contrib.xml is increased to 384MB. This is required for the existing streaming unit tests to pass.
3. Added a test case in contrib/streaming. The test launches a streaming app that allocates 10MB of memory. First the program is launched with sufficient memory, and the test expects it to succeed. Then the program is launched with insufficient memory and is expected to fail.
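
For illustration, the command line that such a launch might produce can be sketched like this. The class name is hypothetical, the use of exec after the ulimit is an assumption, and the quoting is deliberately simple (it assumes arguments contain no single quotes).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: builds a bash invocation that caps virtual memory before exec.
final class UlimitLauncher {
    /**
     * Returns an argv of the form: bash -c "ulimit -v <kb>; exec 'cmd' 'arg' ...".
     * A non-positive limit means no ulimit is applied.
     */
    static List<String> bashCommand(long limitKb, List<String> child) {
        StringBuilder script = new StringBuilder();
        if (limitKb > 0) {
            script.append("ulimit -v ").append(limitKb).append("; ");
        }
        script.append("exec");
        for (String arg : child) {
            // simple quoting; assumes arg contains no single quote
            script.append(" '").append(arg).append('\'');
        }
        return Arrays.asList("bash", "-c", script.toString());
    }

    public static void main(String[] args) {
        System.out.println(bashCommand(10240, Arrays.asList("python", "mapper.py")));
    }
}
```

The returned list could be handed to java.lang.ProcessBuilder. A test in the spirit of item 3 would run a child that allocates about 10MB once with a generous limit and once with a tight one, and compare the exit codes.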

Hadoop QA added a comment - 29/Feb/08 02:10 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376819/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 8 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1876/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1876/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1876/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1876/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 03/Mar/08 04:35 AM
I think the contrib tests failed because of maxmemory being 384MB. I'm increasing it to 512MB. The tests passed on my machine with both 384MB and 512MB.

Hadoop QA added a comment - 03/Mar/08 05:59 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376951/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 8 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1883/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1883/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1883/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1883/console

This message is automatically generated.


Amareshwari Sriramadasu added a comment - 03/Mar/08 06:36 AM
Tests failed even with 512MB. Canceling the patch.

Amareshwari Sriramadasu added a comment - 03/Mar/08 06:38 AM
Increasing maxmemory to 1024MB.

Hadoop QA added a comment - 03/Mar/08 08:13 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376954/patch-2765.txt
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 8 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1884/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1884/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1884/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1884/console

This message is automatically generated.


Nigel Daley added a comment - 05/Mar/08 03:52 AM
This is deemed too risky for a bug fix release. Deferring to 0.17.

Devaraj Das added a comment - 06/Mar/08 06:47 PM
Sigh. This patch doesn't work on Cygwin; ulimit does not seem to be well supported there. I propose we disable the ulimit setting for Hadoop on Cygwin.

Devaraj Das added a comment - 08/Mar/08 09:55 AM
Cancelling the patch since this doesn't work on Cygwin and Mac.
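
One way to skip the ulimit on these platforms is a check on the os.name system property, sketched below. How the committed patch actually handles Cygwin and Mac is not shown in this thread, so the class name and the prefix test are assumptions.

```java
import java.util.Locale;

// Hypothetical platform guard; the committed fix may detect platforms differently.
final class UlimitSupport {
    /** Returns false for Windows (which includes Cygwin) and Mac OS names. */
    static boolean shouldSetUlimit(String osName) {
        String os = osName.toLowerCase(Locale.ENGLISH);
        return !(os.startsWith("windows") || os.startsWith("mac"));
    }

    public static void main(String[] args) {
        System.out.println(shouldSetUlimit(System.getProperty("os.name")));
    }
}
```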

Amareshwari Sriramadasu added a comment - 10/Mar/08 06:36 AM
Patch from Devaraj that fixes the tests on Windows and Mac.

Hadoop QA added a comment - 10/Mar/08 08:28 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12377506/2765.1.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 8 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1926/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1926/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1926/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1926/console

This message is automatically generated.


Devaraj Das added a comment - 11/Mar/08 06:16 PM
I just committed this.

Hudson added a comment - 12/Mar/08 12:18 PM