Hadoop Map/Reduce: MAPREDUCE-1783

Task Initialization should be delayed till when a job can be run

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.22.0, 0.23.0
    • Component/s: contrib/fair-share
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The FairScheduler task scheduler uses PoolManager to limit the number of jobs that can be running at a given time. However, submitted jobs are initialized immediately by EagerTaskInitializationListener, which calls JobInProgress.initTasks. This causes the job split file to be read into memory, even though the split information is not needed until the number of running jobs drops below the specified maximum. If the amount of split information is large, this puts unnecessary memory pressure on the JobTracker.
      To ease memory pressure, FairScheduler can use another implementation of JobInProgressListener that is aware of PoolManager limits and can delay task initialization until the number of running jobs is below the maximum.
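
The proposal above can be illustrated with a minimal, self-contained sketch. It does not use the real Hadoop classes: Job, PoolAwareInitializer, jobAdded, jobCompleted and the single maxRunningJobs limit are hypothetical stand-ins, and admitting waiting jobs in FIFO order is an assumption made only for illustration. The point it shows is that the expensive initTasks step runs only once a job is admitted under the running-job limit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-ins for illustration; none of these are the real Hadoop classes.
class DelayedInitSketch {

  /** Minimal stand-in for a job whose split data is loaded on initialization. */
  static final class Job {
    final String name;
    boolean initialized = false; // true once the (simulated) split file is read
    boolean running = false;

    Job(String name) { this.name = name; }

    /** Simulates the expensive, memory-hungry initialization step. */
    void initTasks() { initialized = true; }
  }

  /** Listener that defers initTasks until the running-job limit has room. */
  static final class PoolAwareInitializer {
    private final int maxRunningJobs;                     // assumed single limit
    private final Queue<Job> waiting = new ArrayDeque<>(); // submitted, not initialized
    private final List<Job> running = new ArrayList<>();   // initialized and running

    PoolAwareInitializer(int maxRunningJobs) { this.maxRunningJobs = maxRunningJobs; }

    /** Called on submission: queue the job, initialize it only if there is room. */
    synchronized void jobAdded(Job job) {
      waiting.add(job);
      admitWhileRoom();
    }

    /** Called on completion: free a slot and admit the next waiting job, if any. */
    synchronized void jobCompleted(Job job) {
      if (running.remove(job)) {
        job.running = false;
        admitWhileRoom();
      }
    }

    // Callers hold the monitor; admits waiting jobs in FIFO order (an assumption).
    private void admitWhileRoom() {
      while (running.size() < maxRunningJobs && !waiting.isEmpty()) {
        Job next = waiting.poll();
        next.initTasks(); // split data is read only at this point
        next.running = true;
        running.add(next);
      }
    }

    synchronized int runningCount() { return running.size(); }
    synchronized int waitingCount() { return waiting.size(); }
  }

  public static void main(String[] args) {
    PoolAwareInitializer listener = new PoolAwareInitializer(2);
    Job a = new Job("a"), b = new Job("b"), c = new Job("c");
    listener.jobAdded(a);
    listener.jobAdded(b);
    listener.jobAdded(c); // limit reached: c waits uninitialized
    System.out.println("c initialized before a finishes: " + c.initialized);
    listener.jobCompleted(a); // frees a slot, so c is initialized now
    System.out.println("c initialized after a finishes: " + c.initialized);
  }
}
```

In this sketch a job that is waiting holds no split data at all, which is the memory saving the description is after. A real scheduler would apply its own limits and ordering rather than the single FIFO queue assumed here.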

      1. MAPREDUCE-1783.patch
        13 kB
        Ramkumar Vadali
      2. submit-mapreduce-1783.patch
        29 kB
        Ramkumar Vadali
      3. 0001-Pool-aware-job-initialization.patch.1
        30 kB
        Ramkumar Vadali
      4. 0001-Pool-aware-job-initialization.patch
        30 kB
        Ramkumar Vadali

        Activity

        Scott Chen added a comment -

        initializing should not be set back to false otherwise there may be a race condition.

        Ramkumar Vadali added a comment -

        Made a fix per Scott's comments.

        Scott Chen added a comment -

        +1
        The patch looks good to me.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12445113/0001-Pool-aware-job-initialization.patch.1
        against trunk revision 946833.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 4 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/541/console

        This message is automatically generated.

        Ramkumar Vadali added a comment -

        Patch was not generated correctly

        Ramkumar Vadali added a comment -

        Formatted patch, this should work.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12445178/submit-mapreduce-1783.patch
        against trunk revision 946955.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/543/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/543/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/543/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/543/console

        This message is automatically generated.

        Ramkumar Vadali added a comment -

        Trying again

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12445178/submit-mapreduce-1783.patch
        against trunk revision 959509.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/279/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/279/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/279/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/279/console

        This message is automatically generated.

        Ramkumar Vadali added a comment -

        Will submit an up-to-date patch.

        Ramkumar Vadali added a comment -

        Patch after svn up

        Ramkumar Vadali added a comment -

        Latest patch TEST RESULTS:

        One test fails, but that also fails on a clean checkout

        [junit] Test org.apache.hadoop.mapred.TestControlledMapReduceJob FAILED (timeout)
        

        ant test-patch succeeds:

             [exec] 
             [exec] 
             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec] 
             [exec]     +1 system test framework.  The patch passed system test framework compile.
             [exec] 
             [exec] 
             [exec] 
             [exec] 
             [exec] ======================================================================
             [exec] ======================================================================
             [exec]     Finished build.
             [exec] ======================================================================
             [exec] ======================================================================
             [exec] 
             [exec] 
        
        BUILD SUCCESSFUL
        Total time: 13 minutes 6 seconds
        Test results are in /tmp/rvadali.hadoopQA
        
        
        Scott Chen added a comment -

        +1 The patch looks good. And we have been running this in our production cluster for a while.
        I will commit this later.

        Scott Chen added a comment -

        I just committed this. Thanks Ram.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #557 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/557/)
        MAPREDUCE-1783. FairScheduler initializes tasks only when the job can be run.
        (Ramkumar Vadali via schen)

        Priyo Mustafi added a comment -

        Hi Scott,
        Can you commit this on 0.22 as well?

        Thanks

        Scott Chen added a comment -

        I just committed this to 0.22.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-22-branch #33 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/33/)

        Konstantin Boudnik added a comment -

        Looks like when the change was committed to 0.22, the CHANGES.txt file was updated improperly (added to the IMPROVEMENT section instead of BUG FIXES), which now causes problems for downstream merges. Also, the description written to CHANGES.txt differs from what it says on the ticket ;(

        Please fix.

        Konstantin Boudnik added a comment -

        I have fixed it.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-22-branch #38 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/38/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)


          People

          • Assignee:
            Ramkumar Vadali
            Reporter:
            Ramkumar Vadali
          • Votes:
            0
            Watchers:
            6
