Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      New contribution MRUnit helps authors of map-reduce programs write unit tests with JUnit.

      Description

      MRUnit is a tool to help authors of MapReduce programs write unit tests.

      Testing map() and reduce() methods requires some repeated work to mock the inputs and outputs of a Mapper or Reducer class, and ensure that the correct values are emitted to the OutputCollector based on inputs. Also, testing a mapper and reducer together requires running them with the sorted ordering guarantees made by the shuffle process.
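The repeated work described above can be illustrated with a minimal, dependency-free sketch. It is not MRUnit's API: `SimpleMapper`, `runMap`, and `WORD_COUNT_MAPPER` are hypothetical names invented for this illustration, and a plain `BiConsumer` stands in for Hadoop's OutputCollector. The idea is only to show what "mock the output and check what was emitted" amounts to.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

public class MapTestSketch {
    /** Hypothetical stand-in for a Mapper: consumes one record, emits zero or more pairs. */
    interface SimpleMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, BiConsumer<K2, V2> collector);
    }

    /** Runs the mapper on one input record and returns everything it emitted, in order. */
    static <K1, V1, K2, V2> List<Map.Entry<K2, V2>> runMap(
            SimpleMapper<K1, V1, K2, V2> mapper, K1 key, V1 value) {
        List<Map.Entry<K2, V2>> emitted = new ArrayList<>();
        // The lambda plays the role of a mock output collector.
        mapper.map(key, value, (k, v) -> emitted.add(new SimpleEntry<>(k, v)));
        return emitted;
    }

    /** Example mapper under test: emits (word, 1) for each whitespace-separated token. */
    static final SimpleMapper<Long, String, String, Integer> WORD_COUNT_MAPPER =
        (offset, line, out) -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.accept(word, 1);
                }
            }
        };

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> actual =
            runMap(WORD_COUNT_MAPPER, 0L, "foo bar foo");
        List<Map.Entry<String, Integer>> expected = Arrays.asList(
            new SimpleEntry<String, Integer>("foo", 1),
            new SimpleEntry<String, Integer>("bar", 1),
            new SimpleEntry<String, Integer>("foo", 1));
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
        System.out.println("map test passed: " + actual); // prints [foo=1, bar=1, foo=1]
    }
}
```

A real test would wrap the comparison in a JUnit assertion; the explicit `AssertionError` is used here only to keep the sketch free of dependencies.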

      This library provides the above functionality to authors of maps and reduces; it allows you to test maps, reduces, and map-reduce pairs without needing to perform all the setup and teardown work associated with running a job.
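Testing a map-reduce pair without running a job could look roughly like the following sketch. Again, this is an illustration under stated assumptions rather than MRUnit's actual interface: `SimpleMapper`, `SimpleReducer`, `run`, `WORD_MAPPER`, and `SUM_REDUCER` are hypothetical names, and a `TreeMap` stands in for the shuffle by grouping values per key and iterating keys in sorted order.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class MapReducePairSketch {
    /** Hypothetical mapper interface: one input record in, any number of pairs out. */
    interface SimpleMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, BiConsumer<K2, V2> out);
    }

    /** Hypothetical reducer interface: one key with all of its values in, pairs out. */
    interface SimpleReducer<K2, V2, K3, V3> {
        void reduce(K2 key, List<V2> values, BiConsumer<K3, V3> out);
    }

    /** Runs map, an in-memory stand-in for the shuffle, then reduce; returns "key=value" strings. */
    static <K1, V1, K2 extends Comparable<K2>, V2, K3, V3> List<String> run(
            SimpleMapper<K1, V1, K2, V2> mapper,
            SimpleReducer<K2, V2, K3, V3> reducer,
            List<Map.Entry<K1, V1>> inputs) {
        // TreeMap groups values by key and iterates keys in sorted order,
        // which mimics the ordering guarantee of the shuffle.
        TreeMap<K2, List<V2>> shuffled = new TreeMap<>();
        for (Map.Entry<K1, V1> in : inputs) {
            mapper.map(in.getKey(), in.getValue(),
                (k, v) -> shuffled.computeIfAbsent(k, unused -> new ArrayList<>()).add(v));
        }
        List<String> results = new ArrayList<>();
        for (Map.Entry<K2, List<V2>> group : shuffled.entrySet()) {
            reducer.reduce(group.getKey(), group.getValue(),
                (k, v) -> results.add(k + "=" + v));
        }
        return results;
    }

    /** Example mapper: emits (word, 1) per token. */
    static final SimpleMapper<Long, String, String, Integer> WORD_MAPPER =
        (offset, line, out) -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.accept(word, 1);
                }
            }
        };

    /** Example reducer: sums the counts for one word. */
    static final SimpleReducer<String, Integer, String, Integer> SUM_REDUCER =
        (word, counts, out) -> {
            int sum = 0;
            for (int count : counts) {
                sum += count;
            }
            out.accept(word, sum);
        };

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> inputs = new ArrayList<>();
        inputs.add(new SimpleEntry<Long, String>(0L, "foo bar"));
        inputs.add(new SimpleEntry<Long, String>(8L, "bar foo foo"));
        System.out.println(run(WORD_MAPPER, SUM_REDUCER, inputs)); // prints [bar=2, foo=3]
    }
}
```

Keeping the whole pipeline in memory is what removes the job setup and teardown: the test only has to supply input pairs and compare the returned list with the expected output.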

      I believe this tool may be useful to the broader Hadoop community, so I have cleaned it up and would like to see it become a "contrib" module. My current environment is based on Hadoop 0.18, so this is the format it expects to use. It does not have support for the new Context-based interfaces for mappers/reducers.

      I have attached the overview.html file for its javadoc, which provides more synopsis and an example of usage; I am also providing the current source code so that you can evaluate its structure.

      Ideally with some feedback from the community this will move toward supporting the current trunk interface soon.

      This currently works with JUnit 4; the supplied patch changes Ivy's libraries.properties file to use JUnit 4.5. I'm marking HADOOP-4901 as a dependency for this reason.

      1. overview.html
        4 kB
        Aaron Kimball
      2. mrunit.patch
        101 kB
        Aaron Kimball
      3. HADOOP-5518-branch18.patch
        99 kB
        Aaron Kimball
      4. HADOOP-5518-3.patch
        103 kB
        Aaron Kimball
      5. HADOOP-5518-2.patch
        101 kB
        Aaron Kimball

          Activity

          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Aaron Kimball added a comment -

          Previous version was not the correct file; this is the correct patch.

          Aaron Kimball added a comment -

          Here is an 18-branch patch for this that works with 0.18.3.

          Hudson added a comment -

          Integrated in Hadoop-trunk #800 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/800/)
          Add contrib/mrunit, a MapReduce unit test framework. Contributed by Aaron Kimball.

          Tsz Wo Nicholas Sze added a comment -

          Thomas, you are right. We need to update eclipse classpath. Filed HADOOP-5637.

          Thomas Sandholm added a comment -

          Seems like this commit broke the automated test-patch script. The Eclipse classpath check always fails (gives -1, as seen above, for all patches).
          The error can be reproduced by diffing the output of:

          find build/ivy/lib/Hadoop/common/ lib/ src/test/lib/ -name '*.jar' | sort
          and
          sed -n 's@.*kind="lib".*path="\(.*jar\)".*@\1@p' < .eclipse.templates/.classpath | sort

          I presume you just need to add the new JUnit version to the Eclipse file '.eclipse.templates/.classpath'. But I am not an Eclipse user, so I am not sure; I am just trying to submit a completely unrelated patch.
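The comparison behind that check can be sketched as follows. This is an illustrative reconstruction, not the test-patch script itself: `ClasspathDiffSketch`, `jarsInClasspath`, and `missingFromClasspath` are hypothetical names, and the jar paths in `main` are made-up sample data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClasspathDiffSketch {
    // Matches a kind="lib" entry and captures the jar path from its path="..." attribute.
    private static final Pattern LIB_ENTRY =
        Pattern.compile("kind=\"lib\".*path=\"([^\"]*\\.jar)\"");

    /** Extracts the jar paths listed as library entries in the lines of a .classpath file. */
    static Set<String> jarsInClasspath(List<String> classpathLines) {
        Set<String> jars = new TreeSet<>();
        for (String line : classpathLines) {
            Matcher m = LIB_ENTRY.matcher(line);
            if (m.find()) {
                jars.add(m.group(1));
            }
        }
        return jars;
    }

    /** Returns the jars present on disk that the classpath file does not mention. */
    static Set<String> missingFromClasspath(Set<String> jarsOnDisk, Set<String> jarsInClasspath) {
        Set<String> missing = new TreeSet<>(jarsOnDisk);
        missing.removeAll(jarsInClasspath);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the find and sed outputs.
        Set<String> onDisk = new TreeSet<>(Arrays.asList(
            "build/ivy/lib/Hadoop/common/junit-4.5.jar",
            "lib/example.jar"));
        List<String> classpathLines = Arrays.asList(
            "<classpathentry kind=\"src\" path=\"src/java\"/>",
            "<classpathentry kind=\"lib\" path=\"build/ivy/lib/Hadoop/common/junit-3.8.1.jar\"/>",
            "<classpathentry kind=\"lib\" path=\"lib/example.jar\"/>");
        System.out.println(missingFromClasspath(onDisk, jarsInClasspath(classpathLines)));
        // prints [build/ivy/lib/Hadoop/common/junit-4.5.jar]
    }
}
```

Any non-empty difference in either direction would make the two sorted lists differ, which is the condition the check reports as -1.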

          Doug Cutting added a comment -

          I just committed this. Thanks, Aaron!

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12404202/HADOOP-5518-3.patch
          against trunk revision 760502.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/84/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/84/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/84/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/84/console

          This message is automatically generated.

          steve_l added a comment -

          I am getting ahead of myself, but I'm thinking we should plan to have a redistributable hadoop-test-tools package into which all this stuff can go. Your code would be the first release.

          Aaron Kimball added a comment -

          Patch (#3) to fix release audit warnings.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12404186/HADOOP-5518-2.patch
          against trunk revision 760098.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.

          -1 release audit. The applied patch generated 662 release audit warnings (more than the trunk's current 660 warnings).

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/81/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/81/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/81/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/81/console

          This message is automatically generated.

          Aaron Kimball added a comment -

          Addresses Steve L.'s comment re. ivy dependencies

          Aaron Kimball added a comment -

          Steve, thanks for your comments. I'll fix the issue re. ivy deps. If you've got other testing tools, it might be good to have a single hadoop-0.x.y-debugging.jar or something which people could use as a single source for these sorts of things. But that's getting a bit ahead of ourselves. Can you say more about what your JUnit stuff does, and/or file a separate JIRA ticket for it?

          Regarding Hudson:

          • The contrib test failures were in the capacity scheduler; these are unrelated to this patch.
          • I can't read the release audit warnings; that link is 404.
          • Can someone please explain what the Eclipse classpath issue is, and how to fix it? Is that a result of depending on JUnit 4.5 instead of 3.8.1?

          steve_l added a comment -

          1. This could work well with other testing improvements, though it would be preferable to be in sync with trunk.
          2. The commons-logging dependency in Ivy should be for master and not default, to avoid pulling in random dependency clutter.
          3. I have some stuff to run JUnit as an MR job; I wonder if we could integrate it into the same redistributable JAR.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12402415/mrunit.patch
          against trunk revision 759398.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.

          -1 release audit. The applied patch generated 661 release audit warnings (more than the trunk's current 659 warnings).

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/71/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/71/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/71/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/71/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/71/console

          This message is automatically generated.

          Aaron Kimball added a comment -

          Initial upload of MRUnit source


            People

            • Assignee:
              Aaron Kimball
              Reporter:
              Aaron Kimball
            • Votes:
              0
              Watchers:
              10
