Hadoop Map/Reduce / MAPREDUCE-4933

MR1 final merge asks for length of file it just wrote before flushing it

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.1
    • Fix Version/s: 1.2.0
    • Component/s: mrv1, task
    • Labels:
      None

      Description

      createKVIterator in ReduceTask contains the following code:

      
                try {
                  Merger.writeFile(rIter, writer, reporter, job);
                  addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath));
                } catch (Exception e) {
                  if (null != outputPath) {
                    fs.delete(outputPath, true);
                  }
                  throw new IOException("Final merge failed", e);
                } finally {
                  if (null != writer) {
                    writer.close();
                  }
                }
      

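      Merger#writeFile() does not close the file after writing it; the writer is only closed in the finally block, after fs.getFileStatus() has already been called on outputPath. Data that has not yet been flushed is therefore not reflected in the returned status, so the length may be incorrect. That length is passed to addToMapOutputFilesOnDisk(), which causes bad accounting further down the line and can lead to map output data being lost.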
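The effect can be reproduced outside Hadoop with plain java.io. The following is a minimal sketch, not the ReduceTask code or the committed patch: the class name FlushBeforeLength and the method measure are hypothetical, and a BufferedOutputStream over a local temp file stands in for the merge writer. It reads the file length once before the stream is closed and once after.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the final merge: a buffered writer over a local file.
final class FlushBeforeLength {

    /** Returns {length observed before close, length observed after close}. */
    static long[] measure(int payloadBytes) throws IOException {
        Path path = Files.createTempFile("final-merge", ".out");
        try {
            long before;
            // 8192-byte buffer: a smaller payload stays in memory until flush/close.
            OutputStream out = new BufferedOutputStream(Files.newOutputStream(path), 8192);
            try {
                out.write(new byte[payloadBytes]);
                // Same ordering as the snippet in the report: length is read
                // while the writer is still open and unflushed.
                before = Files.size(path);
            } finally {
                out.close();
            }
            // Reading the length only after close() sees all written bytes.
            long after = Files.size(path);
            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] lengths = measure(1000);
        System.out.println("before close: " + lengths[0] + ", after close: " + lengths[1]);
    }
}
```

With a payload smaller than the buffer, the length read before close() is 0, while the length read afterwards equals the payload size. By analogy, closing the writer before calling fs.getFileStatus() would avoid the stale length; whether the actual fix takes exactly that form is not stated in this report.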

          People

          • Assignee: Sandy Ryza
          • Reporter: Sandy Ryza