  1. MRUnit
  2. MRUNIT-165

MapReduceDriver calls Mapper#cleanup for each input instead of once

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 1.0.0
    • Labels: None

      Description

      MapReduceDriver calls the Mapper#run method once per input, so the Mapper#cleanup method is also called once per input instead of once overall.

      I believe this is a bug, since the contract in MapReduce is that, for a single Mapper instance, Mapper#cleanup is called exactly once, after all inputs to that mapper have been processed. I might be mistaken in my assumption here.

      This would not be an issue if MapReduceDriver did not hold only a single Mapper instance: every repeated cleanup call lands on that same object.
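
      The difference between the two call patterns can be sketched in plain Java. This is only an illustration under stated assumptions: CountingMapper and CleanupContractSketch are hypothetical stand-ins, not Hadoop or MRUnit classes, and the run method only imitates the setup / map-per-record / cleanup lifecycle described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CleanupContractSketch {

    // Hypothetical stand-in for a mapper; NOT Hadoop's Mapper class.
    static class CountingMapper {
        int cleanupCalls = 0;
        final List<String> output = new ArrayList<>();

        void setup() { /* nothing to initialise in this sketch */ }

        void map(String value) {
            output.add(value.toUpperCase());
        }

        void cleanup() {
            cleanupCalls++;
        }

        // Imitates the lifecycle: setup, map for every record, cleanup once.
        void run(List<String> inputs) {
            setup();
            try {
                for (String value : inputs) {
                    map(value);
                }
            } finally {
                cleanup();
            }
        }
    }

    // Reported behaviour: one run() call per input on a shared instance.
    static int cleanupCallsWhenRunPerInput(List<String> inputs) {
        CountingMapper mapper = new CountingMapper();
        for (String value : inputs) {
            mapper.run(Collections.singletonList(value));
        }
        return mapper.cleanupCalls;
    }

    // Expected behaviour: a single run() call over all inputs.
    static int cleanupCallsWhenRunOnce(List<String> inputs) {
        CountingMapper mapper = new CountingMapper();
        mapper.run(inputs);
        return mapper.cleanupCalls;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c");
        System.out.println(cleanupCallsWhenRunPerInput(inputs)); // prints 3
        System.out.println(cleanupCallsWhenRunOnce(inputs));     // prints 1
    }
}
```

      With three inputs, the shared instance sees three cleanup calls in the first pattern and one in the second.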

      One solution might be to pass the Mapper class into the MapReduceDriver and create a new instance for each input. Another might be to call the MapDriver with multiple inputs (which, as far as I know, is not possible).
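
      The two options can be compared in a simplified model. The sketch below is not MRUnit code: SummingMapper, freshInstancePerInput and singleRunOverAllInputs are hypothetical names, and a Supplier stands in for "passing the Mapper class" so the driver can create instances itself. The mapper is deliberately stateful and throws if cleanup is called twice on the same instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class DriverOptionsSketch {

    // Hypothetical stateful mapper: accumulates a sum and emits it in cleanup().
    static class SummingMapper {
        private int sum = 0;
        private boolean cleanedUp = false;
        final List<Integer> output = new ArrayList<>();

        void map(int value) {
            sum += value;
        }

        void cleanup() {
            if (cleanedUp) {
                throw new IllegalStateException("cleanup called more than once");
            }
            cleanedUp = true;
            output.add(sum);
        }

        // Map every record, then clean up exactly once.
        void run(List<Integer> inputs) {
            try {
                for (int value : inputs) {
                    map(value);
                }
            } finally {
                cleanup();
            }
        }
    }

    // Option 1 (sketch): the driver creates a new mapper instance for each input.
    static List<Integer> freshInstancePerInput(Supplier<SummingMapper> factory,
                                               List<Integer> inputs) {
        List<Integer> collected = new ArrayList<>();
        for (int value : inputs) {
            SummingMapper mapper = factory.get();
            mapper.run(Collections.singletonList(value));
            collected.addAll(mapper.output);
        }
        return collected;
    }

    // Option 2 (sketch): one mapper instance receives all inputs in a single run.
    static List<Integer> singleRunOverAllInputs(SummingMapper mapper,
                                                List<Integer> inputs) {
        mapper.run(inputs);
        return new ArrayList<>(mapper.output);
    }

    public static void main(String[] args) {
        List<Integer> inputs = Arrays.asList(1, 2, 3);
        System.out.println(freshInstancePerInput(SummingMapper::new, inputs)); // prints [1, 2, 3]
        System.out.println(singleRunOverAllInputs(new SummingMapper(), inputs)); // prints [6]
    }
}
```

      In this model both options call cleanup once per instance, but only the single run carries the mapper's state across all inputs, which may matter for a stateful mapper.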

      See attached patch for an example of a stateful mapper and a test which fails due to the bug.

              People

              • Assignee: Dave Beech (dbeech)
              • Reporter: Yoni Ben-Meshulam (bmesh)
              • Votes: 0
              • Watchers: 3