Hadoop Map/Reduce / MAPREDUCE-1317

Reducing memory consumption of rumen objects

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: tools/rumen
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We have encountered OutOfMemoryErrors in mumak and gridmix when dealing with very large jobs. The purpose of this jira is to optimize the memory consumption of rumen-produced job objects.

      1. 3623945-yahoo-20-1xx.patch
        0.6 kB
        rahul k singh
      2. ASF.LICENSE.NOT.GRANTED--mapreduce-1317-yhadoo-20.1xx.patch
        10 kB
        Hong Tang
      3. mapreduce-1317-20091218.patch
        2 kB
        Hong Tang
      4. mapreduce-1317-20091222.patch
        10 kB
        Hong Tang
      5. mapreduce-1317-20091222-2.patch
        10 kB
        Hong Tang
      6. mapreduce-1317-20091223.patch
        10 kB
        Hong Tang

        Activity

        rahul k singh added a comment -

        Adding the patch for the 20.1xx branch. This resolves the NPE in case of setHostName.

        Hong Tang added a comment -

        Patch for the Yahoo Hadoop 20.1xx branch. Not to be committed.

        Hong Tang added a comment -

        Patch "mapreduce-1317-20091223.patch" applies cleanly to yahoop-hadoop-0.20.1xx branch.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #200 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/200/)

        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Hong!

        Tamas Sarlos added a comment -

        Patch looks good to me.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12428801/mapreduce-1317-20091223.patch
        against trunk revision 894964.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/354/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/354/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/354/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/354/console

        This message is automatically generated.

        Hong Tang added a comment -

        The failed tests are not related to the patch.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12428801/mapreduce-1317-20091223.patch
        against trunk revision 893800.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/342/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/342/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/342/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/342/console

        This message is automatically generated.

        Hong Tang added a comment -

        The failed tests are not related to the patch.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12428801/mapreduce-1317-20091223.patch
        against trunk revision 893469.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/339/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/339/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/339/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/339/console

        This message is automatically generated.

        Hong Tang added a comment -

        Hudson is broken; retrying.

        Hong Tang added a comment -

        The 2 failed unit tests in rumen were caused by my false assumption that LoggedXXX objects are immutable; in fact, HadoopLogAnalyzer mutates the List<LoggedTaskAttempt> object returned from the getter method. I restored the original semantics by creating an empty list instead of using Collections.emptyList().

        I filed MAPREDUCE-1330 to propose to make LoggedXXX APIs more consistent in this regard.
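
The difference between the two kinds of empty list can be sketched as follows. This is only an illustration: TaskRecord and EmptyListDemo are hypothetical names and are not taken from the rumen source or the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a LoggedXXX-style class; not the rumen source. */
final class TaskRecord {
  // A shared immutable empty list would save memory, but callers that mutate
  // the getter's result would then fail, so a fresh mutable list is used.
  private final List<String> attempts = new ArrayList<String>();

  List<String> getAttempts() {
    return attempts;
  }
}

/** Hypothetical helper that probes whether a list accepts additions. */
final class EmptyListDemo {
  /** Returns true if the list rejects add() with UnsupportedOperationException. */
  static boolean rejectsAdd(List<String> list) {
    try {
      list.add("attempt_0");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }
}
```

Here EmptyListDemo.rejectsAdd(Collections.<String>emptyList()) reports a rejection, while the list returned by new TaskRecord().getAttempts() accepts the element, which is the behavior a mutating caller relies on.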

        Hong Tang added a comment -

        New patch that fixes the failed unit test.

        Hong Tang added a comment -

        Fixing bugs found through unit tests.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12428791/mapreduce-1317-20091222-2.patch
        against trunk revision 893361.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/239/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/239/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/239/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/239/console

        This message is automatically generated.

        Hong Tang added a comment -

        Try hudson again.

        Hong Tang added a comment -

        New patch that fixes the findbugs warning. The failed contrib tests are known and not related to the changes.

        Also used Collections.emptyList() instead of creating my own copy.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12428720/mapreduce-1317-20091222.patch
        against trunk revision 893055.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/334/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/334/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/334/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/334/console

        This message is automatically generated.

        Hong Tang added a comment -

        I spoke too soon. Although we expect LoggedLocation objects to be created through the JSON library and to be read-only afterwards, the cache may be accessed concurrently, so it needs to be properly synchronized.

        I also found a few other minor improvements that I should incorporate.

        With these, I think we also need to add a unit test to ensure the code runs properly with multiple threads.
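
A minimal sketch of a cache that is safe under concurrent access, together with the kind of multi-threaded check described above. It assumes a ConcurrentHashMap-based design; the actual patch may synchronize differently, and ConcurrentInternCache and ConcurrentInternCacheCheck are hypothetical names.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical thread-safe cache; the actual patch may use a different mechanism. */
final class ConcurrentInternCache<T> {
  private final ConcurrentMap<T, T> cache = new ConcurrentHashMap<T, T>();

  /** Returns the single canonical instance equal to value. */
  T intern(T value) {
    if (value == null) {
      return null; // ConcurrentHashMap does not accept null keys
    }
    // putIfAbsent is atomic, so exactly one thread's instance becomes canonical.
    T previous = cache.putIfAbsent(value, value);
    return previous != null ? previous : value;
  }
}

/** Hypothetical multi-threaded check, in the spirit of the proposed unit test. */
final class ConcurrentInternCacheCheck {
  /** Counts identity-distinct instances handed out across all worker threads. */
  static int countCanonicalInstances(int threadCount, final int lookupsPerThread,
      final int distinctNames) {
    final ConcurrentInternCache<String> cache = new ConcurrentInternCache<String>();
    // Identity-based set: equal strings held in different objects count separately.
    final Set<String> seen = Collections.synchronizedSet(
        Collections.newSetFromMap(new IdentityHashMap<String, Boolean>()));
    Thread[] workers = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++) {
      workers[t] = new Thread(new Runnable() {
        public void run() {
          for (int i = 0; i < lookupsPerThread; i++) {
            // Run-time concatenation builds a fresh String object on each call.
            seen.add(cache.intern("host" + (i % distinctNames)));
          }
        }
      });
      workers[t].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return seen.size();
  }
}
```

If the cache is correct, the count equals the number of distinct names no matter how many threads race on the same key, provided each thread performs at least that many lookups.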

        Hong Tang added a comment -

        Straightforward patch. No additional test is added because it does not change the semantics of the modified classes, and existing unit tests should provide enough coverage.

        Hong Tang added a comment -

        Through YourKit profiling, we found two places where we could save memory:

        • LoggedLocation - we should share references to the same LoggedLocation for the same preferred location for different map tasks.
        • LoggedTaskAttempt.hostName - we should keep a cache of all host names for the cluster and share the references.
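
The reference-sharing idea in the two points above can be sketched as a small cache that hands back one canonical instance per distinct value. HostNameCache is a hypothetical name used only for illustration, not the class introduced by the patch, and this plain HashMap version is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not the rumen API): one shared String per distinct host name. */
final class HostNameCache {
  private final Map<String, String> cache = new HashMap<String, String>();

  /** Returns the canonical instance equal to name, storing name on first sight. */
  String intern(String name) {
    if (name == null) {
      return null; // nothing to share
    }
    String shared = cache.get(name);
    if (shared == null) {
      cache.put(name, name);
      shared = name;
    }
    return shared;
  }

  /** Number of distinct host names currently held. */
  int size() {
    return cache.size();
  }
}
```

With such a cache, many task attempts that ran on the same host hold references to a single String rather than one copy each. The same pattern would presumably apply to location objects keyed by their contents, which is an assumption about the approach rather than a description of the patch.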

          People

          • Assignee:
            Hong Tang
            Reporter:
            Hong Tang
          • Votes:
            0
            Watchers:
            1
