Hadoop Map/Reduce / MAPREDUCE-1061

Gridmix unit test should validate input/output bytes

    Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      TestGridmixSubmission currently verifies only that the correct number of jobs have been run. The test should validate the I/O parameters it claims to satisfy.

      1. M1061-2.patch (13 kB, Chris Douglas)
      2. M1061-1.patch (13 kB, Chris Douglas)
      3. 1061-0.patch (10 kB, Chris Douglas)

        Activity

        Chris Douglas added a comment -

        Verifies that input/output bytes are within 32k, and records within 5, of the spec for each task (spec bytes for reduce tasks currently introduce extra intermediate data).
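
        As a rough illustration of the kind of check described above, the sketch below compares a task's observed byte and record counts against its spec with fixed slack. The class and method names are hypothetical and are not taken from the patch, and reading "32k" as 32 * 1024 bytes is an assumption.

```java
// Illustrative sketch only; names are hypothetical and not taken from the patch.
final class ByteRecordTolerance {
  // "within 32k" from the comment; interpreting 32k as 32 * 1024 bytes is an assumption.
  static final long BYTE_SLACK = 32 * 1024;
  // "records within 5" from the comment.
  static final long RECORD_SLACK = 5;

  /** True when actual differs from expected by at most slack. */
  static boolean within(long expected, long actual, long slack) {
    return Math.abs(actual - expected) <= slack;
  }

  /** True when both the byte count and the record count of a task are within tolerance. */
  static boolean taskMatchesSpec(long specBytes, long runBytes,
                                 long specRecs, long runRecs) {
    return within(specBytes, runBytes, BYTE_SLACK)
        && within(specRecs, runRecs, RECORD_SLACK);
  }

  public static void main(String[] args) {
    // 20,000 bytes and 3 records off: inside both tolerances.
    System.out.println(taskMatchesSpec(1_000_000, 1_020_000, 100, 103));
    // 100,000 bytes off: outside the byte tolerance.
    System.out.println(taskMatchesSpec(1_000_000, 1_100_000, 100, 103));
  }
}
```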

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12421385/1061-0.patch
        against trunk revision 819740.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/6/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/6/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/6/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/6/console

        This message is automatically generated.

        Chris Douglas added a comment -

        Updated based on feedback from Hong. Changed the checks to be per-task rather than looking at the whole job. Tolerances are proportional to the number of map/reduce tasks for bytes, 1 record for input/output.

        Also tightened the reduce output bytes and improved reporting for errors during startup.
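
        A minimal sketch of a per-task check with a byte tolerance that scales with the task count might look like the following. All names are hypothetical, and the per-task byte allowance (`BYTES_PER_PEER`) is an assumed placeholder, since the thread does not state the constant used in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: class, method, and constant names are hypothetical.
final class PerTaskCheck {
  // Assumed byte allowance per peer task; the thread gives no actual value.
  static final long BYTES_PER_PEER = 100;
  // "1 record for input/output" from the comment.
  static final long RECORD_SLACK = 1;

  /** Returns one message for every per-task counter that falls outside tolerance. */
  static List<String> check(long[] specBytes, long[] runBytes,
                            long[] specRecs, long[] runRecs, int peerTasks) {
    // The byte tolerance grows with the number of map/reduce tasks.
    long byteSlack = BYTES_PER_PEER * peerTasks;
    List<String> errors = new ArrayList<>();
    for (int i = 0; i < specBytes.length; i++) {
      if (Math.abs(runBytes[i] - specBytes[i]) > byteSlack) {
        errors.add("task " + i + ": bytes " + runBytes[i] + " vs spec " + specBytes[i]);
      }
      if (Math.abs(runRecs[i] - specRecs[i]) > RECORD_SLACK) {
        errors.add("task " + i + ": records " + runRecs[i] + " vs spec " + specRecs[i]);
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    long[] specBytes = {1000, 2000};
    long[] runBytes = {1150, 2500};
    long[] specRecs = {10, 20};
    long[] runRecs = {11, 20};
    // With 2 peer tasks the byte slack is 200, so only task 1 is reported.
    System.out.println(check(specBytes, runBytes, specRecs, runRecs, 2));
  }
}
```

        Checking each task separately, rather than summing over the job, keeps an overshoot in one task from being masked by an undershoot in another.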

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12421634/M1061-1.patch
        against trunk revision 823227.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/152/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/152/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/152/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/152/console

        This message is automatically generated.

        Chris Douglas added a comment -

        +/- 1 output record for maps accounted for in tolerance

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12421657/M1061-2.patch
        against trunk revision 823227.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/153/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/153/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/153/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/153/console

        This message is automatically generated.

        Hong Tang added a comment -

        Patch looks good. The detailed verification is rigorous, which is nice.

        One minor nit: why do we need to set the actual extra bytes and records proportional to nMaps and nReds? Would it make sense for each map to output at most max(1, nReduce/nMaps) extra records, and for each reducer to output only one extra record?
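
        For concreteness, the bound proposed in this question can be written as a small helper. This is only a sketch of the suggestion, not code from the patch. The names are hypothetical, and treating nReduce/nMaps as integer division is an assumption, since the comment does not say how the quotient is rounded.

```java
// Illustrative sketch of the proposed bound; names are hypothetical.
final class ExtraRecordBound {
  /** Proposed cap on extra output records per map: max(1, nReduce/nMaps). */
  static int maxExtraMapOutputRecords(int nMaps, int nReduces) {
    if (nMaps <= 0) {
      throw new IllegalArgumentException("nMaps must be positive");
    }
    // Integer division is an assumption; the comment does not specify rounding.
    return Math.max(1, nReduces / nMaps);
  }

  /** Proposed cap on extra records per reducer. */
  static int maxExtraReduceRecords() {
    return 1;
  }

  public static void main(String[] args) {
    System.out.println(maxExtraMapOutputRecords(4, 2));
    System.out.println(maxExtraMapOutputRecords(2, 10));
  }
}
```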

        Chris Douglas added a comment -

        Thanks for the review.

        "why do we need to set the actual extra bytes and records proportional to nMaps and nReds"

        If the spec expects 0 bytes/records, then the necessary spec data for each reduce needs to be forgiven. The amount of extra data will be proportional to the number of maps/reduces.

        However, this is adjacent to some sloppiness in the map output, where the spec data is not written as part of the output, but rather as overhead. While the special case will still exist, right now it's the case for all jobs. Since the test still needs to tolerate the 0 cases, I was planning to tighten up the shuffle in a separate issue.

        Hong Tang added a comment -

        Sure. As long as it is not an obvious oversight, I am fine with tightening up the test in a separate issue.

        Chris Douglas added a comment -

        I committed this.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #80 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/80/).
        Add unit test validating byte specifications for gridmix jobs.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #116 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/).
        Add unit test validating byte specifications for gridmix jobs.


          People

          • Assignee: Chris Douglas
          • Reporter: Chris Douglas
          • Votes: 0
          • Watchers: 1
