  1. Hadoop Common
  2. HADOOP-920

MapFileOutputFormat and SequenceFileOutputFormat use incorrect key/value classes in map/reduce tasks

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels: None

    Description

      Let's assume a job uses different key/value classes for the output of map tasks and for the final output of reduce tasks.

      When executing map tasks, the classes returned from JobConf.getMapOutputKeyClass() / getMapOutputValueClass() should be used, and when executing reduce tasks, the classes returned from JobConf.getOutputKeyClass() / getOutputValueClass() should be used.

      Currently, both map and reduce tasks use getMapOutputKeyClass() / getMapOutputValueClass() when the output format is MapFileOutputFormat, and both always use getOutputKeyClass() / getOutputValueClass() when it is SequenceFileOutputFormat. This causes exceptions, because Mapper / Reducer implementations output key/value classes different from the ones the output format expects.
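
      The expected selection rule can be sketched as follows. This is only an illustration of the behavior described above, not the contents of the attached patch: StubJobConf, keyClassFor, valueClassFor and the isMapTask flag are hypothetical names, and plain java.lang classes stand in for real key/value types so the sketch runs without Hadoop on the classpath.

```java
// Sketch only: StubJobConf is a hypothetical stand-in, not Hadoop's JobConf.
class OutputClassSelection {

    /** Hypothetical holder for the classes the four getters named in the report would return. */
    static final class StubJobConf {
        final Class<?> mapOutputKey;
        final Class<?> mapOutputValue;
        final Class<?> outputKey;
        final Class<?> outputValue;

        StubJobConf(Class<?> mapOutputKey, Class<?> mapOutputValue,
                    Class<?> outputKey, Class<?> outputValue) {
            this.mapOutputKey = mapOutputKey;
            this.mapOutputValue = mapOutputValue;
            this.outputKey = outputKey;
            this.outputValue = outputValue;
        }

        Class<?> getMapOutputKeyClass()   { return mapOutputKey; }
        Class<?> getMapOutputValueClass() { return mapOutputValue; }
        Class<?> getOutputKeyClass()      { return outputKey; }
        Class<?> getOutputValueClass()    { return outputValue; }
    }

    /** Map tasks take the map output key class; reduce tasks take the final output key class. */
    static Class<?> keyClassFor(StubJobConf job, boolean isMapTask) {
        return isMapTask ? job.getMapOutputKeyClass() : job.getOutputKeyClass();
    }

    /** Same rule for the value class. */
    static Class<?> valueClassFor(StubJobConf job, boolean isMapTask) {
        return isMapTask ? job.getMapOutputValueClass() : job.getOutputValueClass();
    }

    public static void main(String[] args) {
        // Assumed example job: the map phase emits (String, Integer), the reduce phase emits (String, Long).
        StubJobConf job = new StubJobConf(String.class, Integer.class, String.class, Long.class);

        System.out.println("map task value class:    " + valueClassFor(job, true).getSimpleName());
        System.out.println("reduce task value class: " + valueClassFor(job, false).getSimpleName());
    }
}
```

      In this sketch, an output format that ignores the task type and always reads one pair of getters would hand the wrong value class to one of the two phases, which matches the mismatch reported here.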

      Attachments

        1. key-value-class.patch
          0.8 kB
          Andrzej Bialecki

          People

            Assignee: Andrzej Bialecki
            Reporter: Andrzej Bialecki