  1. Hadoop Map/Reduce
  2. MAPREDUCE-18

Under load the shuffle sometimes gets incorrect data

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.20.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: This patch adds the map ID and reduce ID to the HTTP header of the map output sent to the reduce node. The reduce node also validates the compressed length, decompressed length, map ID, and reduce ID from that HTTP header.

    Description

      While testing HADOOP-5223 under load, we found reduces receiving completely incorrect data. It was often random, but sometimes it was the output of the wrong map or output meant for the wrong reduce. It appears to be either a Jetty or a JVM bug, but it is clearly happening on the server side. In the HADOOP-5223 code, I added information identifying the map and reduce involved, and we should add similar protection to 0.20 and trunk.
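
      The protection described in the release note can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class name, method names, and header names are hypothetical, chosen only for this example. The idea is that the reduce side compares the map ID and reduce ID echoed in the response header against what it requested, and rejects the response if they differ or if either length is missing or negative.

```java
import java.io.IOException;
import java.util.Map;

class ShuffleHeaderCheck {
    // Hypothetical header names, used only for this sketch.
    static final String FROM_MAP = "from-map-task";
    static final String FOR_REDUCE = "for-reduce-task";
    static final String COMPRESSED_LEN = "map-output-length";
    static final String RAW_LEN = "raw-map-output-length";

    /**
     * Validates the header of a fetched map output against the request.
     * Returns the compressed length to read, or throws if anything is off.
     */
    static long validate(Map<String, String> headers,
                         String expectedMapId,
                         int expectedReduce) throws IOException {
        // The server must echo the map attempt that was asked for.
        String mapId = headers.get(FROM_MAP);
        if (!expectedMapId.equals(mapId)) {
            throw new IOException("Data from wrong map: " + mapId
                + ", expected " + expectedMapId);
        }
        // The output must be the partition for this reduce.
        long reduce = parseNonNegative(headers, FOR_REDUCE);
        if (reduce != expectedReduce) {
            throw new IOException("Data for wrong reduce: " + reduce
                + ", expected " + expectedReduce);
        }
        // Both lengths must be present and non-negative.
        long compressed = parseNonNegative(headers, COMPRESSED_LEN);
        parseNonNegative(headers, RAW_LEN);
        return compressed;
    }

    /** Parses a header as a non-negative long, rejecting missing or bad values. */
    static long parseNonNegative(Map<String, String> headers, String name)
            throws IOException {
        String value = headers.get(name);
        if (value == null) {
            throw new IOException("Missing header: " + name);
        }
        long parsed;
        try {
            parsed = Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            throw new IOException("Invalid value for " + name + ": " + value);
        }
        if (parsed < 0) {
            throw new IOException("Negative value for " + name + ": " + parsed);
        }
        return parsed;
    }
}
```

      In a sketch like this, the reduce would call validate before reading any of the body, so that a response belonging to another map or reduce is discarded and can be fetched again instead of being merged as corrupt input.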

      Attachments

        1. MR-18.patch (5 kB, Ravi Gummadi)
        2. MR-18.v1.patch (6 kB, Ravi Gummadi)
        3. MR-18-0.20.patch (6 kB, Ravi Gummadi)
        4. MR-18.v1.1.patch (6 kB, Ravi Gummadi)

          People

            Assignee: Ravi Gummadi (ravidotg)
            Reporter: Owen O'Malley (omalley)
            Votes: 0
            Watchers: 8
