Hadoop Map/Reduce / MAPREDUCE-4933

MR1 final merge asks for length of file it just wrote before flushing it

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.1
    • Fix Version/s: 1.2.0
    • Component/s: mrv1, task
    • Labels: None

      Description

      createKVIterator in ReduceTask contains the following code:

                try {
                  Merger.writeFile(rIter, writer, reporter, job);
                  addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath));
                } catch (Exception e) {
                  if (null != outputPath) {
                    fs.delete(outputPath, true);
                  }
                  throw new IOException("Final merge failed", e);
                } finally {
                  if (null != writer) {
                    writer.close();
                  }
                }
      

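      Merger#writeFile() does not close the writer after writing the file; the writer is closed only in the finally block, after fs.getFileStatus() has already been called. Data still buffered in the writer may not have reached the file at that point, so fs.getFileStatus() may not return the correct length. The wrong length causes bad accounting further down the line, which can lead to map output data being lost.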
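      The ordering problem can be reproduced outside Hadoop. The following is a minimal sketch, not the code from ReduceTask: it assumes a plain java.io.BufferedOutputStream stands in for the merge writer and java.nio.file.Files.size() stands in for fs.getFileStatus().getLen(). The class name FlushBeforeLength and the helper lengthsAroundClose are hypothetical names introduced for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushBeforeLength {

    /**
     * Writes payload through a buffered stream and returns
     * {length seen before close, length seen after close}.
     * Hypothetical helper: it mimics asking for the file length
     * while the writer is still open.
     */
    static long[] lengthsAroundClose(Path file, byte[] payload) throws IOException {
        long before;
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file), 8192)) {
            out.write(payload);
            // The payload is smaller than the 8192-byte buffer, so it is
            // still held in memory and the file on disk is empty.
            before = Files.size(file);
        }
        // The try-with-resources block has closed (and flushed) the stream.
        long after = Files.size(file);
        return new long[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("final-merge", ".out");
        try {
            long[] lengths = lengthsAroundClose(file, new byte[100]);
            // prints "before close: 0, after close: 100"
            System.out.println("before close: " + lengths[0] + ", after close: " + lengths[1]);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

      Applied to the snippet above, the same idea would mean closing the writer before calling fs.getFileStatus(outputPath), so that the recorded length reflects everything that was written. Whether the committed patch takes exactly this form is not stated in this report.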

            People

            • Assignee: Sandy Ryza (sandyr)
            • Reporter: Sandy Ryza (sandyr)
            • Votes: 0
            • Watchers: 10
