  1. MRUnit
  2. MRUNIT-165

MapReduceDriver calls Mapper#cleanup for each input instead of once


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 1.0.0
    • None

    Description

      MapReduceDriver calls the Mapper#run method for each input, causing the Mapper#cleanup method to be called multiple times.

      I believe this is a bug, since the contract in MapReduce is that, for a single Mapper instance, Mapper#cleanup is called only once, after all inputs to that mapper have been processed. I might be mistaken in my assumption here.

      This would not be an issue if MapReduceDriver did not reuse a single Mapper instance for every input.
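
      A minimal, self-contained sketch of the behaviour described above. CountingMapper and CleanupDemo are hypothetical stand-ins written for illustration; they are not the Hadoop or MRUnit classes. The run method only mirrors the setup, map-per-record, cleanup shape of Mapper#run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a stateful Hadoop Mapper (not the real API).
class CountingMapper {
    int cleanupCalls = 0;
    int recordsSeen = 0;
    final List<String> emitted = new ArrayList<>();

    void setup() { }

    void map(String record) {
        recordsSeen++;
    }

    // Emits the accumulated count; meant to run once per mapper instance.
    void cleanup() {
        cleanupCalls++;
        emitted.add("count=" + recordsSeen);
    }

    // Same shape as Mapper#run: setup, map for each record, then cleanup.
    void run(List<String> records) {
        setup();
        for (String record : records) {
            map(record);
        }
        cleanup();
    }
}

class CleanupDemo {
    // Reported behaviour: run() is invoked once per input on one shared instance.
    static CountingMapper runPerInput(List<String> inputs) {
        CountingMapper mapper = new CountingMapper();
        for (String input : inputs) {
            mapper.run(Collections.singletonList(input));
        }
        return mapper;
    }

    // Expected contract: a single run() call covers all inputs.
    static CountingMapper runOnce(List<String> inputs) {
        CountingMapper mapper = new CountingMapper();
        mapper.run(inputs);
        return mapper;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c");
        System.out.println(runPerInput(inputs).emitted); // [count=1, count=2, count=3]
        System.out.println(runOnce(inputs).emitted);     // [count=3]
    }
}
```

      With three inputs, the per-input variant calls cleanup three times and emits a partial count each time, while the single-run variant emits the total once.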

      One solution might be to pass the Mapper class into the MapReduceDriver and create a new instance for each input. Another solution might be to call the MapDriver with multiple inputs (which AFAIK is not possible).
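
      A rough sketch of the first idea, creating a fresh mapper for each input. TallyMapper and FreshInstanceDriver are hypothetical names, and the Supplier is an assumption standing in for passing the Mapper class to the driver; nothing here is MRUnit's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stateful mapper; not the Hadoop Mapper API.
class TallyMapper {
    int cleanupCalls = 0;
    int recordsSeen = 0;

    void run(List<String> records) {
        recordsSeen += records.size(); // stands in for setup + map per record
        cleanupCalls++;                // stands in for cleanup
    }
}

class FreshInstanceDriver {
    // Builds a new mapper per input, so each instance sees cleanup exactly once.
    static List<TallyMapper> runAll(Supplier<TallyMapper> factory, List<String> inputs) {
        List<TallyMapper> instances = new ArrayList<>();
        for (String input : inputs) {
            TallyMapper mapper = factory.get();
            mapper.run(Collections.singletonList(input));
            instances.add(mapper);
        }
        return instances;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c");
        for (TallyMapper m : runAll(TallyMapper::new, inputs)) {
            System.out.println(m.cleanupCalls + " cleanup call, " + m.recordsSeen + " record");
        }
    }
}
```

      One trade-off of this approach, as far as I can tell: cleanup runs once per instance, but state is no longer shared across inputs, so a mapper that aggregates over all of its inputs would still behave differently than it would in a single run.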

      See attached patch for an example of a stateful mapper and a test which fails due to the bug.

      Attachments

        1. reproduce_MRUNIT-165.patch
          3 kB
          Yoni Ben-Meshulam


            People

              Assignee: Dave Beech (dbeech)
              Reporter: Yoni Ben-Meshulam (bmesh)
