Hadoop Common / HADOOP-920

MapFileOutputFormat and SequenceFileOutputFormat use incorrect key/value classes in map/reduce tasks

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels:
      None

      Description

      Assume a job uses different key/value classes for the output of its map tasks and for the final output of its reduce tasks.

      When executing map tasks, the classes returned by JobConf.getMapOutputKeyClass() / getMapOutputValueClass() should be used; when executing reduce tasks, the classes returned by JobConf.getOutputKeyClass() / getOutputValueClass() should be used.

      Currently, both map and reduce tasks use getMapOutputKeyClass() / getMapOutputValueClass() when the output format is MapFileOutputFormat, and both use getOutputKeyClass() / getOutputValueClass() when it is SequenceFileOutputFormat. This causes exceptions, because Mapper / Reducer implementations emit key/value classes that differ from the ones the output format expects.
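
      The expected behaviour described above can be sketched roughly as follows. This is an illustration only, not the attached patch: JobSettings is a hypothetical stand-in that mirrors the four JobConf getters named in this report, OutputClassSelection and the isMapTask flag are invented for the example, and plain Java classes stand in for the job's real key/value types.

```java
// Hypothetical stand-in for the four JobConf getters named in this report.
// It is NOT the Hadoop class; it only carries the configured classes.
final class JobSettings {
    private final Class<?> mapOutputKeyClass;
    private final Class<?> mapOutputValueClass;
    private final Class<?> outputKeyClass;
    private final Class<?> outputValueClass;

    JobSettings(Class<?> mapOutputKeyClass, Class<?> mapOutputValueClass,
                Class<?> outputKeyClass, Class<?> outputValueClass) {
        this.mapOutputKeyClass = mapOutputKeyClass;
        this.mapOutputValueClass = mapOutputValueClass;
        this.outputKeyClass = outputKeyClass;
        this.outputValueClass = outputValueClass;
    }

    Class<?> getMapOutputKeyClass() { return mapOutputKeyClass; }
    Class<?> getMapOutputValueClass() { return mapOutputValueClass; }
    Class<?> getOutputKeyClass() { return outputKeyClass; }
    Class<?> getOutputValueClass() { return outputValueClass; }
}

// Hypothetical helper: picks the classes an output format should write,
// based on the kind of task rather than on the output format in use.
final class OutputClassSelection {
    private OutputClassSelection() {}

    static Class<?> keyClassFor(JobSettings job, boolean isMapTask) {
        // Map tasks write map-output keys; reduce tasks write final-output keys.
        return isMapTask ? job.getMapOutputKeyClass() : job.getOutputKeyClass();
    }

    static Class<?> valueClassFor(JobSettings job, boolean isMapTask) {
        // Same rule for values.
        return isMapTask ? job.getMapOutputValueClass() : job.getOutputValueClass();
    }

    public static void main(String[] args) {
        // Assumed example job: maps emit (String, Integer), reduces emit (String, Long).
        JobSettings job = new JobSettings(String.class, Integer.class,
                                          String.class, Long.class);
        System.out.println("map value class:    " + valueClassFor(job, true).getName());
        System.out.println("reduce value class: " + valueClassFor(job, false).getName());
    }
}
```

      With the behaviour reported here, the equivalent of isMapTask would effectively be fixed by the output format (always true for MapFileOutputFormat, always false for SequenceFileOutputFormat), which is why one of the two task kinds ends up writing with the wrong classes.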

        Attachments

        1. key-value-class.patch
          0.8 kB
          Andrzej Bialecki

            People

            • Assignee:
              ab Andrzej Bialecki
              Reporter:
              ab Andrzej Bialecki