Hadoop Map/Reduce: MAPREDUCE-18

Under load the shuffle sometimes gets incorrect data

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.20.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      This patch adds the map ID and reduce ID to the HTTP header of the map output sent to the reduce node. The reduce node also validates the compressed length, decompressed length, map ID, and reduce ID from the HTTP header.

      Description

      While testing HADOOP-5223 under load, we found reduces receiving completely incorrect data. The data was often random, but sometimes it was the output of the wrong map for the wrong reduce. It appears to be either a Jetty or a JVM bug, but it is clearly happening on the server side. In the HADOOP-5223 code, I added information about which map and reduce were included, and we should add similar protection to 0.20 and trunk.
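      The reduce-side check described in the release note can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class name, the method names, and the header names are all hypothetical, and the headers are modeled as a plain map rather than a real HTTP connection. The idea is only that the reduce compares what the server claims to have sent against what it asked for, and rejects the response before reading any data if they disagree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of validating shuffle response headers on the reduce side.
class ShuffleHeaderCheck {
    // Header names are illustrative assumptions, not taken from the patch.
    static final String FROM_MAP = "from-map-task";
    static final String FOR_REDUCE = "for-reduce-task";
    static final String COMPRESSED_LEN = "map-output-length";
    static final String RAW_LEN = "raw-map-output-length";

    /** Returns null if the headers match the request, otherwise a short reason. */
    static String validate(Map<String, String> headers,
                           String expectedMapId, int expectedReduce) {
        String mapId = headers.get(FROM_MAP);
        if (!expectedMapId.equals(mapId)) {
            return "wrong map: " + mapId;
        }
        String reduce = headers.get(FOR_REDUCE);
        if (!Integer.toString(expectedReduce).equals(reduce)) {
            return "wrong reduce: " + reduce;
        }
        long compressed = parseLength(headers.get(COMPRESSED_LEN));
        long raw = parseLength(headers.get(RAW_LEN));
        if (compressed < 0 || raw < 0) {
            return "invalid lengths: " + compressed + ", " + raw;
        }
        return null;
    }

    /** Parses a length header; a missing or malformed value yields -1. */
    static long parseLength(String value) {
        if (value == null) {
            return -1;
        }
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put(FROM_MAP, "map_0007");
        headers.put(FOR_REDUCE, "3");
        headers.put(COMPRESSED_LEN, "1024");
        headers.put(RAW_LEN, "4096");
        // The reduce asked for the output of map_0007 destined for reduce 3.
        String problem = validate(headers, "map_0007", 3);
        System.out.println(problem == null ? "accepted" : "rejected: " + problem);
    }
}
```

      A caller would treat any non-null result as a failed fetch and retry, rather than merging the data into the reduce input.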

      1. MR-18.patch
        5 kB
        Ravi Gummadi
      2. MR-18.v1.1.patch
        6 kB
        Ravi Gummadi
      3. MR-18.v1.patch
        6 kB
        Ravi Gummadi
      4. MR-18-0.20.patch
        6 kB
        Ravi Gummadi

          People

          • Assignee:
            Ravi Gummadi
            Reporter:
            Owen O'Malley
          • Votes:
            0
            Watchers:
            8
