Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-3252

MR2: Map tasks rewrite data once even if output fits in sort buffer

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: mrv2, task
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Target Version/s:

      Description

      I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

      1. mr-3252.txt
        3 kB
        Todd Lipcon
      2. mr-3252.txt
        3 kB
        Todd Lipcon

        Activity

          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development