  1. Hadoop Map/Reduce
  2. MAPREDUCE-2064

Tutorial should mention setMapOutputKeyClass


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • 0.21.0
    • None
    • documentation

    Description

      The official tutorial (mapred_tutorial.html), like all other tutorials I've seen on the web, shows a program that has the same datatypes for the key/value pairs emitted by the mapper and by the reducer. It shows a configuration call to Job.setOutput{Key,Value}Class but doesn't say that the call refers to both the mapper and the reducer; it sounds like it refers only to the reducer output. This might be mentioned in the "Job Configuration" section. Here is a possible addition, after the "The Job is used to specify ..." paragraph.

      The job also configures the types of its key/value pairs with setOutputKeyClass(type) and setOutputValueClass(type), which apply to both the mapper and reducer classes. If the types output by the mapper and reducer are not the same, those calls should be followed by setMapOutputKeyClass(type) and setMapOutputValueClass(type).

      (I'm assuming that at least a call to setOutput{Key,Value}Class is required.)
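
      To illustrate the proposed wording, here is a minimal plain-Java sketch of the fallback it describes. JobTypesSketch is a hypothetical stand-in, not Hadoop's Job class: only the four setter names come from the text above, the getters are added for illustration, and String, Integer and Double stand in for Writable types such as Text and IntWritable. The sketch assumes that the map output types fall back to the job output types whenever they have not been set separately.

```java
// Hypothetical model of how a job's key/value types could be resolved.
// This is NOT Hadoop code; it only mirrors the setter names from the issue.
class JobTypesSketch {
    private Class<?> outputKeyClass;       // job (reducer) output key type
    private Class<?> outputValueClass;     // job (reducer) output value type
    private Class<?> mapOutputKeyClass;    // null means "not set separately"
    private Class<?> mapOutputValueClass;  // null means "not set separately"

    void setOutputKeyClass(Class<?> type) { outputKeyClass = type; }
    void setOutputValueClass(Class<?> type) { outputValueClass = type; }
    void setMapOutputKeyClass(Class<?> type) { mapOutputKeyClass = type; }
    void setMapOutputValueClass(Class<?> type) { mapOutputValueClass = type; }

    Class<?> getOutputKeyClass() { return outputKeyClass; }
    Class<?> getOutputValueClass() { return outputValueClass; }

    // Assumed rule: use the map-specific type if one was set,
    // otherwise fall back to the job output type.
    Class<?> getMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }

    Class<?> getMapOutputValueClass() {
        return mapOutputValueClass != null ? mapOutputValueClass : outputValueClass;
    }

    public static void main(String[] args) {
        // Case 1: mapper and reducer emit the same types (the tutorial's case).
        JobTypesSketch sameTypes = new JobTypesSketch();
        sameTypes.setOutputKeyClass(String.class);
        sameTypes.setOutputValueClass(Integer.class);
        System.out.println(sameTypes.getMapOutputValueClass().getSimpleName()); // Integer

        // Case 2: mapper emits Integer values, reducer emits Double values.
        JobTypesSketch differentTypes = new JobTypesSketch();
        differentTypes.setOutputKeyClass(String.class);
        differentTypes.setOutputValueClass(Double.class);
        differentTypes.setMapOutputValueClass(Integer.class);
        System.out.println(differentTypes.getMapOutputValueClass().getSimpleName()); // Integer
        System.out.println(differentTypes.getOutputValueClass().getSimpleName());    // Double
    }
}
```

      In the second case only the value type differs, so only setMapOutputValueClass is called and the map output key type is still taken from setOutputKeyClass.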


          People

            Assignee: Unassigned
            Reporter: Clarence Gardner