MRUnit / MRUNIT-165

MapReduceDriver calls Mapper#cleanup for each input instead of once

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 1.0.0
    • Labels:
      None

      Description

      MapReduceDriver calls the Mapper#run method for each input, causing the Mapper#cleanup method to be called multiple times.

      I believe this is a bug, since the contract in MapReduce is that, for a single Mapper instance, the Mapper#cleanup method is only called once after all inputs to that mapper have been processed. I might be mistaken in my assumption here.

      This would not be an issue, were it not for the fact that MapReduceDriver holds only a single instance of Mapper, so any state that instance accumulates carries over between the repeated Mapper#cleanup calls.
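
      The difference between the two behaviours can be illustrated with a small, dependency-free model. This is only a sketch: ModelMapper, SummingMapper and LifecycleDemo are hypothetical names invented for illustration, not Hadoop or MRUnit classes. The model's run method merely mirrors the general shape of Mapper#run (setup, then map for each record, then cleanup).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for the mapper lifecycle; NOT Hadoop's real Mapper class.
abstract class ModelMapper {
    void setup(List<Integer> out) {}
    abstract void map(int value, List<Integer> out);
    void cleanup(List<Integer> out) {}

    // Mirrors the shape of Mapper#run: setup once, map per record, cleanup once.
    final void run(List<Integer> inputs, List<Integer> out) {
        setup(out);
        for (int value : inputs) {
            map(value, out);
        }
        cleanup(out);
    }
}

// Hypothetical stateful mapper: accumulates a sum and emits it only in cleanup.
class SummingMapper extends ModelMapper {
    private int sum = 0;

    @Override
    void map(int value, List<Integer> out) {
        sum += value;
    }

    @Override
    void cleanup(List<Integer> out) {
        out.add(sum);
    }
}

class LifecycleDemo {
    // Behaviour described in the report: one shared instance, run() per input.
    static List<Integer> runPerInput(List<Integer> inputs) {
        ModelMapper mapper = new SummingMapper();
        List<Integer> out = new ArrayList<>();
        for (int value : inputs) {
            mapper.run(Collections.singletonList(value), out);
        }
        return out;
    }

    // Expected contract: one instance, one run() over all inputs.
    static List<Integer> runOnce(List<Integer> inputs) {
        ModelMapper mapper = new SummingMapper();
        List<Integer> out = new ArrayList<>();
        mapper.run(inputs, out);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> inputs = Arrays.asList(1, 2, 3);
        System.out.println(runPerInput(inputs)); // [1, 3, 6]
        System.out.println(runOnce(inputs));     // [6]
    }
}
```

      In this model, calling run once per input on the shared instance emits a running total after every record, whereas a single run over all inputs emits one final total.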

      One solution might be to pass the Mapper class into the MapReduceDriver and create a new instance for each input. Another solution might be to call the MapDriver with multiple inputs (which AFAIK is not possible).
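
      The two proposed options can be compared in a rough sketch. Everything here is an assumption made for illustration: SketchMapper and DriverOptions are hypothetical names, and a Supplier stands in for passing the Mapper class to the driver. None of this is the actual MRUnit API or the fix that was eventually applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Simplified stand-in for a mapper; NOT the real Hadoop or MRUnit API.
class SketchMapper {
    private int seen = 0; // state carried across map() calls

    void map(String record, List<String> out) {
        seen++;
    }

    void cleanup(List<String> out) {
        out.add("records=" + seen);
    }

    // Same shape as Mapper#run: map every record, then cleanup exactly once.
    void run(List<String> records, List<String> out) {
        for (String record : records) {
            map(record, out);
        }
        cleanup(out);
    }
}

class DriverOptions {
    // Option 1: build a fresh mapper for each input (the Supplier stands in
    // for passing the Mapper class and instantiating it).
    static List<String> freshInstancePerInput(Supplier<SketchMapper> factory,
                                              List<String> inputs) {
        List<String> out = new ArrayList<>();
        for (String record : inputs) {
            factory.get().run(Collections.singletonList(record), out);
        }
        return out;
    }

    // Option 2: hand all inputs to a single run() call on one instance.
    static List<String> singleRunOverAllInputs(SketchMapper mapper,
                                               List<String> inputs) {
        List<String> out = new ArrayList<>();
        mapper.run(inputs, out);
        return out;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c");
        System.out.println(freshInstancePerInput(SketchMapper::new, inputs));
        // [records=1, records=1, records=1]
        System.out.println(singleRunOverAllInputs(new SketchMapper(), inputs));
        // [records=3]
    }
}
```

      In this model both options call cleanup exactly once per mapper instance, but only the second lets a single instance see every input before cleanup runs.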

      See attached patch for an example of a stateful mapper and a test which fails due to the bug.


            People

            • Assignee:
              Dave Beech
              Reporter:
              Yoni Ben-Meshulam
            • Votes:
              0
              Watchers:
              3
