Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 0.9.0
- None
Description
MapReduceDriver calls the Mapper#run method for each input, causing the Mapper#cleanup method to be called multiple times.
I believe this is a bug, since the contract in MapReduce is that, for a single Mapper instance, the Mapper#cleanup method is only called once after all inputs to that mapper have been processed. I might be mistaken in my assumption here.
This only becomes a problem because MapReduceDriver holds a single Mapper instance, so that one instance has its cleanup method invoked once per input.
One solution might be to pass the Mapper class into the MapReduceDriver and create a new instance for each input. Another solution might be to call the MapDriver once with multiple inputs (which, as far as I know, is not currently possible).
See attached patch for an example of a stateful mapper and a test which fails due to the bug.
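For readers without the patch at hand, the effect can be sketched in plain Java. This is not the attached patch and it does not use the Hadoop or MRUnit classes: CountingMapper, CleanupDemo, runPerInput and runOnce are hypothetical names, and run() only imitates the general shape of Mapper#run (map every input, then call cleanup once).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a stateful Mapper; NOT the Hadoop Mapper class.
class CountingMapper {
    int cleanupCalls = 0;                        // how often cleanup() ran
    private int seen = 0;                        // state accumulated across map() calls
    final List<String> output = new ArrayList<>();

    void map(String value) {
        seen++;
    }

    // Emits the accumulated total; intended to run once per mapper instance.
    void cleanup() {
        cleanupCalls++;
        output.add("total=" + seen);
    }

    // Imitates the shape of Mapper#run: map all inputs, then cleanup once.
    void run(List<String> inputs) {
        for (String value : inputs) {
            map(value);
        }
        cleanup();
    }
}

class CleanupDemo {
    // Behaviour described in this report: run() is called per input on one shared instance.
    static CountingMapper runPerInput(List<String> inputs) {
        CountingMapper mapper = new CountingMapper();
        for (String value : inputs) {
            mapper.run(Collections.singletonList(value));
        }
        return mapper;
    }

    // Expected contract: one run() over all inputs, hence one cleanup().
    static CountingMapper runOnce(List<String> inputs) {
        CountingMapper mapper = new CountingMapper();
        mapper.run(inputs);
        return mapper;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c");
        CountingMapper perInput = runPerInput(inputs);
        CountingMapper once = runOnce(inputs);
        System.out.println(perInput.cleanupCalls + " " + perInput.output); // 3 [total=1, total=2, total=3]
        System.out.println(once.cleanupCalls + " " + once.output);         // 1 [total=3]
    }
}
```

With three inputs, the per-input variant emits three partial totals from cleanup, whereas a single run over all inputs emits one final total, which is what a test of a stateful mapper would expect.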
Attachments
Issue Links
- is related to
- MRUNIT-64 Multiple Input Key, Value Pairs should be supported (Resolved)