Details

    • Hadoop Flags:
      Reviewed
    • Release Note:
      Improves cumulative CPU emulation for short running tasks.
    • Tags:
      gridmix emulation cpu

      Description

      CPU emulation in Gridmix fails to meet the expected target if the map has no data to sort/spill/merge. There are 2 major reasons for this:
      1. The map task end immediately ends soon after the map task. The map progress is 67% while the map phase ends.
      2. Currently, the sort (comparator) doesnt emulate CPU. If the map is short lived, the CPU emulation thread (spawned from the map task in cleanup) doesn't get a chance to emulate.

      1. mapreduce-2591-v1.7-0.20-security.patch
        15 kB
        Amar Kamat
      2. mapreduce-2591-v1.7.patch
        15 kB
        Amar Kamat
      3. mapreduce-2591-v1.4.2.patch
        15 kB
        Amar Kamat

        Activity

        Hide
        Thomas Graves added a comment -

        merged back to branch-2

        Show
        Thomas Graves added a comment - merged back to branch-2
        Hide
        Matt Foley added a comment -

        Closed upon release of Hadoop-1.1.0.

        Show
        Matt Foley added a comment - Closed upon release of Hadoop-1.1.0.
        Hide
        Amar Kamat added a comment -

        I just committed the backported patch to branch-0.20-security. Thanks Vinay!

        Show
        Amar Kamat added a comment - I just committed the backported patch to branch-0.20-security. Thanks Vinay!
        Hide
        Vinay Kumar Thota added a comment -

        +1 for the patch.

        Show
        Vinay Kumar Thota added a comment - +1 for the patch.
        Hide
        Amar Kamat added a comment -

        Backported patch for the branch-0.20-security branch. System tests passed. Our analysis shows improvements in emulation. I couldn't run test-patch and JUnit tests as the code compilation fails because of task-controller.

        Show
        Amar Kamat added a comment - Backported patch for the branch-0.20-security branch. System tests passed. Our analysis shows improvements in emulation. I couldn't run test-patch and JUnit tests as the code compilation fails because of task-controller.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #853 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/853/)
        MAPREDUCE-3008. Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk)

        amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #853 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/853/ ) MAPREDUCE-3008 . Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk) amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #823 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/823/)
        MAPREDUCE-3008. Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk)

        amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #823 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/823/ ) MAPREDUCE-3008 . Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk) amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #1048 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1048/)
        MAPREDUCE-3008. Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk)

        amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #1048 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1048/ ) MAPREDUCE-3008 . Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk) amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #1108 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1108/)
        MAPREDUCE-3008. Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk)

        amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #1108 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1108/ ) MAPREDUCE-3008 . Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk) amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #1030 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1030/)
        MAPREDUCE-3008. Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk)

        amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #1030 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1030/ ) MAPREDUCE-3008 . Improvements to cumulative CPU emulation for short running tasks in Gridmix. (amarrk) amarrk : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179933 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/CumulativeCpuUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/ResourceUsageMatcher.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/emulators/resourceusage/TotalHeapUsageEmulatorPlugin.java /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestResourceUsageEmulators.java
        Hide
        Amar Kamat added a comment -

        I just committed this to trunk! Thanks Ravi for helping with the reviews.

        Show
        Amar Kamat added a comment - I just committed this to trunk! Thanks Ravi for helping with the reviews.
        Hide
        Ravi Gummadi added a comment -

        Patch looks good to me.
        +1

        Show
        Ravi Gummadi added a comment - Patch looks good to me. +1
        Hide
        Amar Kamat added a comment -

        Attaching a new patch incorporating Ravi's offline comments.

        test-patch passed:

        +1 overall.  
        
            +1 @author.  The patch does not contain any @author tags.
        
            +1 tests included.  The patch appears to include 3 new or modified tests.
        
            +1 javadoc.  The javadoc tool did not generate any warning messages.
        
            +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
        
            +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
        
            +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        All the JUnit tests in test-contrib passed except TestUserResolve, which is a known issue (MAPREDUCE-2950).

        Show
        Amar Kamat added a comment - Attaching a new patch incorporating Ravi's offline comments. test-patch passed: +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. All the JUnit tests in test-contrib passed except TestUserResolve, which is a known issue ( MAPREDUCE-2950 ).
        Hide
        Ravi Gummadi added a comment -

        Some comments on the patch:

        (1) Boosting is happening for map-only job's map task progress also. This can cause the progress value to become 1.0+0.33 ? Please check this case while boosting progress.

        (2) The name of the variable "lastSeenCpuUsageCpuUsage" needs to be changed to "lastSeenCpuUsage".

        Show
        Ravi Gummadi added a comment - Some comments on the patch: (1) Boosting is happening for map-only job's map task progress also. This can cause the progress value to become 1.0+0.33 ? Please check this case while boosting progress. (2) The name of the variable "lastSeenCpuUsageCpuUsage" needs to be changed to "lastSeenCpuUsage".
        Hide
        Amar Kamat added a comment -

        Attaching a patch that improves CPU emulation for short running tasks. Areas of improvements:
        1. Sorter/Comparator now is CPU emulation aware
        2. For tasks with no spills/merges, aggressive CPU emulation is done.

        tets-patch and JUnit tests for Gridmix passed.

        Show
        Amar Kamat added a comment - Attaching a patch that improves CPU emulation for short running tasks. Areas of improvements: 1. Sorter/Comparator now is CPU emulation aware 2. For tasks with no spills/merges, aggressive CPU emulation is done. tets-patch and JUnit tests for Gridmix passed.

          People

          • Assignee:
            Amar Kamat
            Reporter:
            Amar Kamat
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development