Hadoop Map/Reduce
MAPREDUCE-4087

[Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixes the issue of GenerateDistCacheData job slowness.

      Description

      In the map() method of Gridmix's GenerateDistCacheData job, val.setSize() is called on every iteration, based on the number of bytes still to be written to the current distributed cache file. When the same map task goes on to write data to the next distributed cache file, the size of the random data generated in each iteration can become small in some cases. This can make distributed cache data generation slow.
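      The effect described above can be illustrated with a small, self-contained sketch. It does not use any Hadoop classes and is not the code from the attached patches: Buf is a hypothetical stand-in for the val buffer, BUFFER_SIZE and the file sizes are made-up values, and writeWithLocalSize is only one possible way to avoid the shrinking, assuming the chunk size can be tracked in a local variable instead of in the buffer itself.

```java
import java.util.Random;

final class ChunkSizeSketch {
    static final int BUFFER_SIZE = 1024; // hypothetical chunk size, not taken from Gridmix

    /** Hypothetical stand-in for a resizable byte buffer that is reused across files. */
    static final class Buf {
        final byte[] bytes = new byte[BUFFER_SIZE];
        int length = BUFFER_SIZE; // logical size, changed by the shrinking variant
    }

    /**
     * Shrinking pattern: the buffer's logical length is cut down to the bytes
     * remaining in the current file and is never restored afterwards.
     * Returns the number of chunk-generation iterations needed for the file.
     */
    static int writeShrinking(Buf val, long fileSize, Random r) {
        int iterations = 0;
        for (long bytes = fileSize; bytes > 0; bytes -= val.length) {
            r.nextBytes(val.bytes);
            val.length = (int) Math.min(val.length, bytes);
            iterations++; // a real job would write val.length bytes here
        }
        return iterations;
    }

    /**
     * Alternative pattern: the chunk size lives in a local variable, so the
     * reused buffer keeps its full length for the next file.
     */
    static int writeWithLocalSize(Buf val, long fileSize, Random r) {
        int iterations = 0;
        int size = 0;
        for (long bytes = fileSize; bytes > 0; bytes -= size) {
            r.nextBytes(val.bytes);
            size = (int) Math.min(val.length, bytes);
            iterations++; // a real job would write size bytes here
        }
        return iterations;
    }

    public static void main(String[] args) {
        Random r = new Random(0);

        Buf shrinking = new Buf();
        System.out.println("shrinking, file 1: " + writeShrinking(shrinking, 1025, r));
        System.out.println("shrinking, file 2: " + writeShrinking(shrinking, 4096, r));

        Buf stable = new Buf();
        System.out.println("local size, file 1: " + writeWithLocalSize(stable, 1025, r));
        System.out.println("local size, file 2: " + writeWithLocalSize(stable, 4096, r));
    }
}
```

      In this sketch the first file ends with a 1-byte chunk, so the shrinking variant needs 4096 iterations for the 4096-byte second file, while the local-size variant needs 4.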

      1. 4087.trunk.patch
        1 kB
        Ravi Gummadi
      2. 4087.patch
        1 kB
        Ravi Gummadi

        Activity

        Ravi Gummadi created issue -
        Ravi Gummadi added a comment -

        Attaching patch with the fix.

        Ravi Gummadi made changes -
        Field Original Value New Value
        Attachment 4087.patch [ 12520592 ]
        Ravi Gummadi made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Release Note: Fixes the issue of Generate Dist Cache Data generation job slowness.
        Ravi Gummadi added a comment -

        Attached patch is for branch-1.

        Amar Kamat added a comment -

        Looks good to me. +1

        Ravi Gummadi made changes -
        Status: Patch Available [ 10002 ] -> Open [ 1 ]
        Ravi Gummadi added a comment -

        Attaching patch for trunk.

        Ravi Gummadi made changes -
        Attachment 4087.trunk.patch [ 12520595 ]
        Ravi Gummadi made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Hadoop Flags: Reviewed [ 10343 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520592/4087.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2115//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2115//console

        This message is automatically generated.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520595/4087.trunk.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2116//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2116//console

        This message is automatically generated.

        Ravi Gummadi added a comment -

        Since this fix improves the runtime of the GenerateDistCacheData job, adding a unit test for it does not seem simple, so I am not adding one.

        Gridmix unit tests passed on my local machine.

        I just committed this to trunk and branch-1.

        Ravi Gummadi made changes -
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Release Note: "Fixes the issue of Generate Dist Cache Data generation job slowness." -> "Fixes the issue of GenerateDistCacheData job slowness."
        Resolution: Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #1957 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1957/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = SUCCESS
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #2032 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2032/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = SUCCESS
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #1970 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1970/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = ABORTED
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1001 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1001/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = FAILURE
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1036 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1036/)
        MAPREDUCE-4087. [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases (ravigummadi) (Revision 1307740)

        Result = FAILURE
        ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307740
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
        Matt Foley added a comment -

        Based on @Ravi: 31/Mar/12 09:29 "I just committed this to trunk and branch-1."
        Marking this fixed in 3.0.0 and 1.1.0.

        Matt Foley made changes -
        Fix Version/s 1.1.0 [ 12317960 ]
        Fix Version/s 3.0.0 [ 12320355 ]
        Matt Foley added a comment -

        Closed upon release of Hadoop-1.1.0.

        Matt Foley made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]
        Thomas Graves added a comment -

        I merged this to branch-2

        Thomas Graves made changes -
        Fix Version/s 2.0.5-beta [ 12324032 ]
        Allen Wittenauer made changes -
        Fix Version/s 3.0.0 [ 12320355 ]
        Fix Version/s 2.1.0-beta [ 12324032 ]
        Transition | Time In Source Status | Execution Times | Last Executer | Last Execution Date
        Patch Available -> Open | 19m 46s | 1 | Ravi Gummadi | 30/Mar/12 12:04
        Open -> Patch Available | 5m 52s | 2 | Ravi Gummadi | 30/Mar/12 12:05
        Patch Available -> Resolved | 21h 23m | 1 | Ravi Gummadi | 31/Mar/12 09:29
        Resolved -> Closed | 200d 9h 57m | 1 | Matt Foley | 17/Oct/12 19:27

          People

          • Assignee:
            Ravi Gummadi
            Reporter:
            Ravi Gummadi
          • Votes:
            0
            Watchers:
            2
