Hadoop Map/Reduce / MAPREDUCE-4087

[Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixes the issue of GenerateDistCacheData job slowness.

      Description

      In the map() method of Gridmix's GenerateDistCacheData job, val.setSize() is called on every iteration, based on the number of bytes still to be written to the current distributed cache file. When the same map task then writes data to the next distributed cache file, the amount of random data generated in each iteration can be small, depending on the case. This can make distributed cache data generation slow.
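
      The effect described above can be illustrated with a minimal, Hadoop-free sketch. It is not the attached patch: the class name, method names, and the use of a plain int in place of the shared buffer's length are all assumptions made for illustration. The first method models a shared length that is overwritten on every iteration and carried over to the next file; the second models one plausible remedy, computing the chunk size in a local variable so the buffer keeps its full capacity.

```java
// Hypothetical illustration only; names and structure are not taken from Gridmix.
final class ChunkSizeSketch {

    // Models the reported behaviour: the buffer's logical length (standing in
    // for val.getLength()) is overwritten on each iteration (standing in for
    // val.setSize(...)) and is never restored before the next file.
    static long writesWithShrinkingBuffer(long[] fileSizes, int capacity) {
        int length = capacity;
        long writes = 0;
        for (long remaining : fileSizes) {
            while (remaining > 0) {
                length = (int) Math.min(length, remaining);
                remaining -= length;
                writes++;
            }
        }
        return writes;
    }

    // Models an assumed remedy: the chunk size is a local value, so the
    // buffer length stays at full capacity for every file.
    static long writesWithLocalChunkSize(long[] fileSizes, int capacity) {
        long writes = 0;
        for (long remaining : fileSizes) {
            while (remaining > 0) {
                int chunk = (int) Math.min(capacity, remaining);
                remaining -= chunk;
                writes++;
            }
        }
        return writes;
    }

    public static void main(String[] args) {
        long[] sizes = {1025, 4096}; // two files written by one map task
        System.out.println(writesWithShrinkingBuffer(sizes, 1024));
        System.out.println(writesWithLocalChunkSize(sizes, 1024));
    }
}
```

      In this sketch, the last chunk of the first file is a single byte, so the shrinking variant writes the whole second file one byte at a time, while the local-chunk variant keeps writing in buffer-sized pieces.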

      1. 4087.patch
        1 kB
        Ravi Gummadi
      2. 4087.trunk.patch
        1 kB
        Ravi Gummadi

        Activity

        Allen Wittenauer made changes -
        Fix Version/s 3.0.0 [ 12320355 ]
        Fix Version/s 2.1.0-beta [ 12324032 ]
        Thomas Graves made changes -
        Fix Version/s 2.0.5-beta [ 12324032 ]
        Thomas Graves added a comment -

        I merged this to branch-2

        Matt Foley made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Matt Foley added a comment -

        Closed upon release of Hadoop-1.1.0.

        Matt Foley made changes -
        Fix Version/s 1.1.0 [ 12317960 ]
        Fix Version/s 3.0.0 [ 12320355 ]
        Matt Foley added a comment -

        Based on @Ravi: 31/Mar/12 09:29 "I just committed this to trunk and branch-1."
        Marking this fixed in 3.0.0 and 1.1.0.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1036 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1036/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = FAILURE
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1001 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1001/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = FAILURE
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #1970 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1970/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = ABORTED
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #2032 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2032/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = SUCCESS
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #1957 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1957/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = SUCCESS
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Ravi Gummadi made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Release Note "Fixes the issue of Generate Dist Cache Data generation job slowness." changed to "Fixes the issue of GenerateDistCacheData job slowness."
        Resolution Fixed [ 1 ]
        Ravi Gummadi added a comment -

        As this fix improves the runtime of the GenerateDistCacheData job, adding a unit test for it does not seem simple, so I am not adding one.

        Gridmix unit tests passed on my local machine.

        I just committed this to trunk and branch-1.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520595/4087.trunk.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2116//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2116//console

        This message is automatically generated.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520592/4087.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2115//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2115//console

        This message is automatically generated.

        Ravi Gummadi made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop Flags Reviewed [ 10343 ]
        Ravi Gummadi made changes -
        Attachment 4087.trunk.patch [ 12520595 ]
        Ravi Gummadi added a comment -

        Attaching patch for trunk.

        Ravi Gummadi made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Amar Kamat added a comment -

        Looks good to me. +1

        Ravi Gummadi added a comment -

        Attached patch is for branch-1.

        Ravi Gummadi made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Release Note Fixes the issue of Generate Dist Cache Data generation job slowness.
        Ravi Gummadi made changes -
        Field Original Value New Value
        Attachment 4087.patch [ 12520592 ]
        Ravi Gummadi added a comment -

        Attaching patch with the fix.

        Ravi Gummadi created issue -

          People

          • Assignee:
            Ravi Gummadi
            Reporter:
            Ravi Gummadi
          • Votes:
            0
            Watchers:
            2
