Affects Version/s: 0.9.0
Fix Version/s: 1.0.0
MapReduceDriver calls the Mapper#run method for each input, causing the Mapper#cleanup method to be called multiple times.
I believe this is a bug, since the contract in MapReduce is that, for a single Mapper instance, Mapper#cleanup is called exactly once, after all inputs to that mapper have been processed. I might be mistaken in my assumption here.
This would not be an issue, were it not for the fact that MapReduceDriver has only a single instance of Mapper.
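To make the effect concrete, here is a minimal, self-contained sketch. It does not use Hadoop or MRUnit: SummingMapper, runOncePerInput and runOnceForAllInputs are hypothetical stand-ins that only mimic the shape of Mapper#run (process the inputs, then call cleanup) and the driver behaviour described above. It is not the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CleanupContractSketch {

    /** Hypothetical stand-in for a stateful mapper; NOT org.apache.hadoop.mapreduce.Mapper. */
    static class SummingMapper {
        private int sum = 0;                       // state accumulated across map() calls
        int cleanupCalls = 0;                      // how often cleanup() has been invoked
        final List<Integer> output = new ArrayList<>();

        void map(int value) {
            sum += value;
        }

        // Emits the accumulated state; meant to run once per mapper instance.
        void cleanup() {
            cleanupCalls++;
            output.add(sum);
        }

        // Same shape as Mapper#run: map every input, then clean up.
        void run(List<Integer> inputs) {
            for (int value : inputs) {
                map(value);
            }
            cleanup();
        }
    }

    /** Mimics the reported behaviour: one shared instance, run() called per input. */
    static SummingMapper runOncePerInput(List<Integer> inputs) {
        SummingMapper mapper = new SummingMapper();
        for (int value : inputs) {
            mapper.run(Collections.singletonList(value));
        }
        return mapper;
    }

    /** Mimics the expected contract: run() called once with all inputs. */
    static SummingMapper runOnceForAllInputs(List<Integer> inputs) {
        SummingMapper mapper = new SummingMapper();
        mapper.run(inputs);
        return mapper;
    }

    public static void main(String[] args) {
        List<Integer> inputs = Arrays.asList(1, 2, 3);
        SummingMapper perInput = runOncePerInput(inputs);
        SummingMapper single = runOnceForAllInputs(inputs);
        System.out.println(perInput.cleanupCalls + " " + perInput.output); // 3 [1, 3, 6]
        System.out.println(single.cleanupCalls + " " + single.output);     // 1 [6]
    }
}
```

With run called per input on a shared instance, the stand-in mapper emits three partial sums instead of the single total a test would expect.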
One solution might be to pass the Mapper class into the MapReduceDriver and create a new instance for each input. Another solution might be to call the MapDriver with multiple inputs (which AFAIK is not possible).
See attached patch for an example of a stateful mapper and a test which fails due to the bug.