[MRUNIT-69] new mrunit api - ASF JIRA

Details

Type: Umbrella
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.8.1
Fix Version/s: None
Labels: None

Description

So I am curious: what is the plan for the long-term future of MRUNIT?

I think MRUNIT is currently useful for unit testing a single mapper or reducer, but there is a void when it comes to testing more complicated features such as MultipleInputs, MultipleOutputs, a driver class, and counters, among other things. I wonder if, instead of adding support for these extra features to the current MRUNIT framework, it would be more useful to add hooks to the existing LocalJobRunner and MiniMRCluster classes that provide methods to more easily verify file output from text files, sequence files, etc. This would allow MRUNIT to test driver classes, MultipleInputs, MultipleOutputs, etc. MRUNIT would also then test against the real Hadoop code instead of an implementation that mimics Hadoop, which can miss bugs such as the ReduceDriver not reusing the same object until 0.8.0. MRUNIT would also keep up with new MapReduce features instead of us having to implement fake versions of them.

I understand that performance would be an issue due to the file I/O, but I wonder how fast the LocalJobRunner would be if we wrote a new class that extended FileSystem to let users write fake files to memory and made the LocalJobRunner read from them.
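
As a rough illustration of what "methods to more easily verify file output from text files" could look like, here is a minimal sketch in plain Java. It is not part of MRUNIT or Hadoop: the class name TextOutputVerifier and both of its methods are hypothetical. It assumes the job has already been run (for example through the LocalJobRunner) into a local output directory, that the output files follow the usual part-* naming, and that each line is a key and a value separated by a tab, which is the default for text output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an existing MRUNIT or Hadoop class.
final class TextOutputVerifier {

    private TextOutputVerifier() {
    }

    /**
     * Reads every part-* file in a local job output directory, in file-name
     * order, and splits each line into a key and a value at the first tab.
     * Other files (such as _SUCCESS or hidden checksum files) are ignored.
     */
    static List<Map.Entry<String, String>> readTextOutput(Path outputDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path part : stream) {
                parts.add(part);
            }
        }
        Collections.sort(parts);

        List<Map.Entry<String, String>> records = new ArrayList<>();
        for (Path part : parts) {
            for (String line : Files.readAllLines(part, StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                // A line without a tab is treated as a key with an empty value.
                String key = tab < 0 ? line : line.substring(0, tab);
                String value = tab < 0 ? "" : line.substring(tab + 1);
                records.add(new SimpleEntry<>(key, value));
            }
        }
        return records;
    }

    /** Fails with an AssertionError if the output records differ from the expected ones. */
    static void assertTextOutput(Path outputDir, List<Map.Entry<String, String>> expected)
            throws IOException {
        List<Map.Entry<String, String>> actual = readTextOutput(outputDir);
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }
}
```

A test would run the driver and then call TextOutputVerifier.assertTextOutput(outputDir, expected). A sequence file variant would need Hadoop's own reader classes and is left out of this sketch.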

Sub-Tasks

1.	create new branch for the new api	Resolved	Jim Donofrio
2.	add this branch to jenkins	Resolved	Brock Noland
3.	determine package structure	Open	Unassigned
4.	run a generic MapReduce driver (via the local job runner) and validate the output	Open	Unassigned
5.	merge back into trunk, deprecate old api	Open	Unassigned

People

Assignee: Jim Donofrio

Reporter: Jim Donofrio

Votes: 0

Watchers: 4

Dates

Created: 28/Feb/12 02:29

Updated: 15/Aug/12 10:57