MRUnit is a tool to help authors of MapReduce programs write unit tests.
Testing map() and reduce() methods requires repetitive work: mocking the inputs and outputs of a Mapper or Reducer class, and verifying that the correct values are emitted to the OutputCollector for the given inputs. Testing a mapper and reducer together additionally requires running them with the sorted-ordering guarantees made by the shuffle process.
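To make the repeated work concrete, here is a rough sketch in plain Java of the kind of test double one ends up writing by hand. Collector, CapturingCollector, TokenCountMapper, and MapperTestSketch are hypothetical names invented for this illustration; they are not Hadoop or MRUnit classes, and the real OutputCollector interface differs (for one, its collect() method declares IOException).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an output collector; NOT the Hadoop interface.
interface Collector<K, V> {
    void collect(K key, V value);
}

// Test double that records every emitted pair in emission order.
class CapturingCollector<K, V> implements Collector<K, V> {
    final List<Map.Entry<K, V>> emitted = new ArrayList<>();

    @Override
    public void collect(K key, V value) {
        emitted.add(new SimpleEntry<>(key, value));
    }
}

// Hypothetical mapper under test: emits (token, 1) for each
// whitespace-separated token in the input line.
class TokenCountMapper {
    void map(Long offset, String line, Collector<String, Integer> out) {
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.collect(token, 1);
            }
        }
    }
}

class MapperTestSketch {
    // Runs the mapper on a single input record and returns what it emitted.
    static List<Map.Entry<String, Integer>> runMapper(String line) {
        CapturingCollector<String, Integer> out = new CapturingCollector<>();
        new TokenCountMapper().map(0L, line, out);
        return out.emitted;
    }
}
```

A test then compares the returned list against the expected pairs. Every mapper or reducer test needs some variant of this scaffolding, which is the duplication the library is meant to remove.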
This library takes care of that work for authors of mappers and reducers; it lets you test a mapper, a reducer, or a mapper-reducer pair without performing all the setup and teardown associated with running a job.
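As an illustration of what testing a map-reduce pair involves, the following plain-Java sketch simulates the shuffle in memory by grouping mapper output by key and presenting the keys to the reducer in sorted order. MapReducePairSketch and its map(), reduce(), and run() methods are hypothetical names made up for this example, not the library's API, and the word-count logic is only a placeholder job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

class MapReducePairSketch {
    // Hypothetical map step: emits (token, 1) per whitespace-separated token.
    static void map(String line, BiConsumer<String, Integer> out) {
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.accept(token, 1);
            }
        }
    }

    // Hypothetical reduce step: sums all values seen for one key.
    static void reduce(String key, List<Integer> values,
                       BiConsumer<String, Integer> out) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        out.accept(key, sum);
    }

    // Runs map, an in-memory stand-in for the shuffle, then reduce.
    static List<String> run(List<String> lines) {
        // TreeMap keeps keys sorted, mimicking the ordering the shuffle provides.
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            map(line, (k, v) ->
                grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            reduce(e.getKey(), e.getValue(), (k, v) -> results.add(k + "=" + v));
        }
        return results;
    }
}
```

Because the grouping happens in memory, a test of this shape needs no job configuration, input files, or cluster.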
I believe this tool may be useful to the broader Hadoop community, so I have cleaned it up and would like to see it become a "contrib" module. My current environment is based on Hadoop 0.18, so the library targets the Mapper and Reducer interfaces of that release. It does not yet support the new Context-based interfaces for mappers and reducers.
I have attached the overview.html file for its javadoc, which provides a fuller synopsis and a usage example; I am also providing the current source code so that you can evaluate its structure.
Ideally, with some feedback from the community, this will soon move toward supporting the current trunk interface.
This currently works with JUnit 4; the supplied patch changes Ivy's libraries.properties file to use JUnit 4.5. I'm marking HADOOP-4901 as a dependency for this reason.