Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-3252

MR2: Map tasks rewrite data once even if output fits in sort buffer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: mrv2, task
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

        Attachments

        1. mr-3252.txt
          3 kB
          Todd Lipcon
        2. mr-3252.txt
          3 kB
          Todd Lipcon

          Activity

            People

            • Assignee:
              tlipcon Todd Lipcon
              Reporter:
              tlipcon Todd Lipcon
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: